Sync to linguist 7.2.0: heuristics.yml support (#189)

Sync with GitHub Linguist v7.2.0

Includes a new way of handling `heuristics.yml`, and
all of `./data/*` is re-generated from the GitHub Linguist [v7.2.0](https://github.com/github/linguist/releases/tag/v7.2.0)
release tag.

 - many new languages
 - better vendoring detection
 - updated docs on the update procedure and known issues.
Alexander 2019-02-14 12:47:45 +01:00 committed by GitHub
parent 13d3d66d37
commit 3499750785
45 changed files with 105155 additions and 74316 deletions

CONTRIBUTING.md Normal file

@ -0,0 +1,61 @@
# source{d} Contributing Guidelines
source{d} projects accept contributions via GitHub pull requests.
This document outlines some of the
conventions on development workflow, commit message formatting, contact points,
and other resources to make it easier to get your contribution accepted.
## Certificate of Origin
By contributing to this project, you agree to the [Developer Certificate of
Origin (DCO)](DCO). This document was created by the Linux Kernel community and is a
simple statement that you, as a contributor, have the legal right to make the
contribution.
To show your agreement with the DCO, include the following line at the end of the commit message,
using your real name: `Signed-off-by: John Doe <john.doe@example.com>`.
This can be done easily using the [`-s`](https://github.com/git/git/blob/b2c150d3aa82f6583b9aadfecc5f8fa1c74aca09/Documentation/git-commit.txt#L154-L161) flag of `git commit`.
If you have already pushed a few commits without `Signed-off-by`, you can still add it afterwards. We wrote a guide that can help: [fix-DCO.md](https://github.com/src-d/guide/blob/master/developer-community/fix-DCO.md).
## Support Channels
The official support channels, for both users and contributors, are:
- GitHub issues: each repository has its own list of issues.
- Slack: join the [source{d} Slack](https://join.slack.com/t/sourced-community/shared_invite/enQtMjc4Njk5MzEyNzM2LTFjNzY4NjEwZGEwMzRiNTM4MzRlMzQ4MmIzZjkwZmZlM2NjODUxZmJjNDI1OTcxNDAyMmZlNmFjODZlNTg0YWM) community.
Before opening a new issue or submitting a new pull request, it's helpful to
search the project first; it's likely that another user has already reported the
issue you're facing, or it's a known issue that we're already aware of.
## How to Contribute
Pull Requests (PRs) are the only way to contribute code to source{d} projects.
In order for a PR to be accepted it needs to pass this list of requirements:
- The contribution must be clearly explained in natural language and provide a minimal working example that reproduces it.
- All PRs must be written idiomatically:
- for Go: formatted according to [gofmt](https://golang.org/cmd/gofmt/), and without any warnings from [go lint](https://github.com/golang/lint) or [go vet](https://golang.org/cmd/vet/)
- for other languages, similar constraints apply.
- They should in general include tests, and those shall pass.
- If the PR is a bug fix, it has to include a new unit test that fails before the patch is merged.
- If the PR is a new feature, it has to come with a suite of unit tests that test the new functionality.
- In any case, all the PRs have to pass the personal evaluation of at least one of the [maintainers](MAINTAINERS) of the project.
### Format of the commit message
Every commit message should describe what was changed, under which context and, if applicable, the GitHub issue it relates to:
```
plumbing: packp, Skip argument validations for unknown capabilities. Fixes #623
```
The format can be described more formally as follows:
```
<package>: <subpackage>, <what changed>. [Fixes #<issue-number>]
```
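The format above can be checked mechanically. The following is a minimal sketch, assuming the subpackage part is mandatory and the `Fixes #<issue-number>` suffix is optional; the pattern and the `validCommitSubject` helper are illustrative and are not part of any source{d} tooling.

```go
package main

import (
	"fmt"
	"regexp"
)

// commitMsgRe is a hypothetical pattern for
// "<package>: <subpackage>, <what changed>. [Fixes #<issue-number>]".
var commitMsgRe = regexp.MustCompile(`^[^:\s]+: [^,\s]+, .+\.( Fixes #\d+)?$`)

// validCommitSubject reports whether subject follows the documented format.
func validCommitSubject(subject string) bool {
	return commitMsgRe.MatchString(subject)
}

func main() {
	subjects := []string{
		"plumbing: packp, Skip argument validations for unknown capabilities. Fixes #623",
		"fixed a bug",
	}
	for _, s := range subjects {
		fmt.Printf("%q valid: %v\n", s, validCommitSubject(s))
	}
}
```

A check like this could run in a commit hook, although maintainers may well accept messages that deviate from the strict pattern.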


@ -46,6 +46,11 @@ clean: clean-linguist clean-shared
code-generate: $(LINGUIST_PATH)
mkdir -p data && \
go run internal/code-generator/main.go
ENRY_TEST_REPO="$${PWD}/.linguist" go test -v \
-run Test_GeneratorTestSuite \
./internal/code-generator/generator \
-testify.m TestUpdateGeneratorTestSuiteGold \
-update_gold
benchmarks: $(LINGUIST_PATH)
go test -run=NONE -bench=. && \


@ -154,14 +154,17 @@ Generated Java bindings using a C-shared library and JNI are located under [`jav
Development
------------
*enry* re-uses parts of original [linguist](https://github.com/github/linguist) to generate internal data structures. In order to update to the latest upstream and generate the necessary code you must run:
*enry* re-uses parts of original [linguist](https://github.com/github/linguist) to generate internal data structures. In order to update to the latest upstream and generate all the necessary code you must run:
git clone https://github.com/github/linguist.git .linguist
# update commit in generator_test.go (to re-generate .gold fixtures)
# https://github.com/src-d/enry/blob/13d3d66d37a87f23a013246a1b0678c9ee3d524b/internal/code-generator/generator/generator_test.go#L18
go generate
We update enry when changes are done in linguist's master branch on the following files:
* [languages.yml](https://github.com/github/linguist/blob/master/lib/linguist/languages.yml)
* [heuristics.rb](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.rb)
* [heuristics.yml](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.yml)
* [vendor.yml](https://github.com/github/linguist/blob/master/lib/linguist/vendor.yml)
* [documentation.yml](https://github.com/github/linguist/blob/master/lib/linguist/documentation.yml)
@ -183,17 +186,11 @@ Divergences from linguist
Using [linguist/samples](https://github.com/github/linguist/tree/master/samples)
as a set for the tests, the following issues were found:
* With [hello.ms](https://github.com/github/linguist/blob/master/samples/Unix%20Assembly/hello.ms) we can't detect the language (Unix Assembly) because we don't have a matcher in contentMatchers (content.go) for Unix Assembly. Linguist uses this [regexp](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.rb#L300) in its code,
* [Heuristics for ".es" extension](https://github.com/github/linguist/blob/e761f9b013e5b61161481fcb898b59721ee40e3d/lib/linguist/heuristics.yml#L103) in JavaScript could not be parsed because the RE2 regexp engine does not support backreferences
`elsif /(?<!\S)\.(include|globa?l)\s/.match(data) || /(?<!\/\*)(\A|\n)\s*\.[A-Za-z][_A-Za-z0-9]*:/.match(data.gsub(/"([^\\"]|\\.)*"|'([^\\']|\\.)*'|\\\s*(?:--.*)?\n/, ""))`
* As of [Linguist v5.3.2](https://github.com/github/linguist/releases/tag/v5.3.2), Linguist uses a [flex-based scanner in C for tokenization](https://github.com/github/linguist/pull/3846). Enry still uses the regex-based [extract_token](https://github.com/github/linguist/pull/3846/files#diff-d5179df0b71620e3fac4535cd1368d15L60) algorithm. Tracked under https://github.com/src-d/enry/issues/193
which we can't port.
* All files for the SQL language fall to the classifier because we don't parse
this [disambiguator
expression](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.rb#L433)
for `*.sql` files right. This expression doesn't comply with the pattern for the
rest in [heuristics.rb](https://github.com/github/linguist/blob/master/lib/linguist/heuristics.rb).
* The Bayesian classifier cannot distinguish "SQL" from "PLpgSQL". Tracked under https://github.com/src-d/enry/issues/194
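The regexp-related divergences come from the syntax that Go's RE2-based `regexp` package accepts. A small sketch that probes this: the first and second patterns are made up for illustration, and the third is a fragment of the Unix Assembly expression quoted in this list. The `compiles` helper is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	patterns := []string{
		`^\s*import\s`,                 // plain RE2 syntax (illustrative)
		`(['"])use strict\1`,           // backreference (illustrative)
		`(?<!\S)\.(include|globa?l)\s`, // negative lookbehind
	}
	for _, p := range patterns {
		fmt.Printf("%-32q compiles: %v\n", p, compiles(p))
	}
}
```

Only the first pattern compiles. Backreferences and lookbehind assertions are rejected at compile time, so a code generator has to skip or rewrite such heuristics.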
`enry` [CLI tool](#cli) does not require a full Git repository to be present in filesystem in order to report languages.
@ -232,7 +229,7 @@ As benchmarks depend on Ruby and Github-Linguist gem make sure you have:
If you want to reproduce the same benchmarks as reported above:
- Make sure all [dependencies](#benchmark-dependencies) are installed
- Install [gnuplot](http://gnuplot.info) (in order to plot the histogram)
- Run `ENRY_TEST_REPO=.linguist benchmarks/run.sh` (takes ~15h)
- Run `ENRY_TEST_REPO="$PWD/.linguist" benchmarks/run.sh` (takes ~15h)
It will run the benchmarks for enry and linguist, parse the output, create csv files and plot the histogram. This takes some time.


@ -28,9 +28,6 @@ var (
)
func TestMain(m *testing.M) {
var exitCode int
defer os.Exit(exitCode)
flag.BoolVar(&slow, "slow", false, "run benchmarks per sample for strategies too")
flag.Parse()
@ -47,7 +44,7 @@ func TestMain(m *testing.M) {
log.Fatal(err)
}
exitCode = m.Run()
os.Exit(m.Run())
}
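A likely motivation for replacing `defer os.Exit(exitCode)` with `os.Exit(m.Run())`: Go evaluates the arguments of a deferred call when the `defer` statement executes, not when the deferred call runs, so the deferred `os.Exit` would have received the initial zero value. A minimal sketch of that rule, using hypothetical helper names:

```go
package main

import "fmt"

// deferredArg passes code as an argument in the defer statement,
// so the deferred function sees the value code had at that moment.
func deferredArg() (seen int) {
	code := 0
	defer func(c int) { seen = c }(code) // code is evaluated here
	code = 1
	return
}

// deferredClosure reads code inside the closure body,
// so it sees the value code has when the function returns.
func deferredClosure() (seen int) {
	code := 0
	defer func() { seen = code }() // code is read when the closure runs
	code = 1
	return
}

func main() {
	fmt.Println("argument form:", deferredArg())
	fmt.Println("closure form:", deferredClosure())
}
```

The argument form reports the stale value and the closure form reports the updated one. Calling `os.Exit(m.Run())` directly avoids the question altogether.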
func cloneLinguist(linguistURL string) error {


@ -16,7 +16,7 @@ const OtherLanguage = ""
// Strategy type fixes the signature for the functions that can be used as a strategy.
type Strategy func(filename string, content []byte, candidates []string) (languages []string)
// DefaultStrategies is the strategies' sequence GetLanguage uses to detect languages.
// DefaultStrategies is a sequence of strategies used by GetLanguage to detect languages.
var DefaultStrategies = []Strategy{
GetLanguagesByModeline,
GetLanguagesByFilename,
@ -397,12 +397,13 @@ func GetLanguagesByContent(filename string, content []byte, _ []string) []string
}
ext := strings.ToLower(filepath.Ext(filename))
fnMatcher, ok := data.ContentMatchers[ext]
heuristic, ok := data.ContentHeuristics[ext]
if !ok {
return nil
}
return fnMatcher(content)
return heuristic.Match(content)
}
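The hunk above swaps a per-extension matcher function for a value with a `Match` method looked up in `data.ContentHeuristics`. The generated types are not shown here, so the following is only a sketch of how a data-driven heuristic of that shape could work; the `rule` and `heuristic` types, the map contents and the patterns are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule pairs a content pattern with the languages it selects (hypothetical).
type rule struct {
	pattern   *regexp.Regexp
	languages []string
}

// heuristic is an ordered list of rules for one file extension (hypothetical).
type heuristic []rule

// Match returns the languages of the first rule whose pattern matches content,
// or nil when no rule applies.
func (h heuristic) Match(content []byte) []string {
	for _, r := range h {
		if r.pattern.Match(content) {
			return r.languages
		}
	}
	return nil
}

// contentHeuristics maps a lower-cased extension to its rules (made-up data).
var contentHeuristics = map[string]heuristic{
	".h": {
		{regexp.MustCompile(`(?m)^\s*@(interface|implementation)\b`), []string{"Objective-C"}},
		{regexp.MustCompile(`(?m)^\s*#\s*include <(cstdint|string|vector)>`), []string{"C++"}},
	},
}

func main() {
	content := []byte("#include <vector>\n")
	fmt.Println(contentHeuristics[".h"].Match(content))
}
```

Keeping the rules as data rather than as hand-written functions is what allows them to be regenerated from a YAML file.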
// GetLanguagesByClassifier uses DefaultClassifier as a Classifier and returns a sorted slice of possible languages ordered by
@ -455,9 +456,7 @@ func GetLanguageType(language string) (langType Type) {
// GetLanguageByAlias returns either the language related to the given alias and ok set to true
// or OtherLanguage and ok set to false if the alias is not recognized.
func GetLanguageByAlias(alias string) (lang string, ok bool) {
a := strings.Split(alias, `,`)[0]
a = strings.ToLower(a)
lang, ok = data.LanguagesByAlias[a]
lang, ok = data.LanguageByAlias(alias)
if !ok {
lang = OtherLanguage
}
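The body of `data.LanguageByAlias` is not part of this excerpt. Judging by the removed lines (take the text before the first comma, lower-case it) and by the map comment stating that whitespace in keys is replaced by underscores, a lookup along these lines seems plausible. This is a sketch under those assumptions; `aliasMap` and `languageByAlias` are stand-ins, with entries copied from the generated map.

```go
package main

import (
	"fmt"
	"strings"
)

// aliasMap is a small stand-in for the generated alias map.
var aliasMap = map[string]string{
	"c++":        "C++",
	"emacs_lisp": "Emacs Lisp",
	"fstar":      "F*",
}

// languageByAlias normalizes alias the way the map keys are written
// (first comma-separated item, lower case, spaces as underscores)
// and looks it up.
func languageByAlias(alias string) (string, bool) {
	key := strings.Split(alias, ",")[0]
	key = strings.ToLower(key)
	key = strings.Replace(key, " ", "_", -1)
	lang, ok := aliasMap[key]
	return lang, ok
}

func main() {
	for _, alias := range []string{"Emacs Lisp", "Fstar", "unknown"} {
		lang, ok := languageByAlias(alias)
		fmt.Printf("%q -> %q (%v)\n", alias, lang, ok)
	}
}
```

Normalizing inside the data package would let the tests pass sample directory names such as `Fstar` directly and still get the canonical language name back.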


@ -11,6 +11,7 @@ import (
"gopkg.in/src-d/enry.v1/data"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"github.com/stretchr/testify/suite"
)
@ -19,9 +20,36 @@ const linguistClonedEnvVar = "ENRY_TEST_REPO"
type EnryTestSuite struct {
suite.Suite
repoLinguist string
samplesDir string
cloned bool
tmpLinguist string
needToClone bool
samplesDir string
}
func (s *EnryTestSuite) TestRegexpEdgeCases() {
var regexpEdgeCases = []struct {
lang string
filename string
}{
{lang: "ActionScript", filename: "FooBar.as"},
{lang: "Forth", filename: "asm.fr"},
{lang: "X PixMap", filename: "cc-public_domain_mark_white.pm"},
//{lang: "SQL", filename: "drop_stuff.sql"}, // https://github.com/src-d/enry/issues/194
{lang: "Fstar", filename: "Hacl.Spec.Bignum.Fmul.fst"},
{lang: "C++", filename: "Types.h"},
}
for _, r := range regexpEdgeCases {
filename := fmt.Sprintf("%s/samples/%s/%s", s.tmpLinguist, r.lang, r.filename)
content, err := ioutil.ReadFile(filename)
require.NoError(s.T(), err)
lang := GetLanguage(r.filename, content)
s.T().Logf("File:%s, lang:%s", filename, lang)
expLang, _ := data.LanguageByAlias(r.lang)
require.EqualValues(s.T(), expLang, lang)
}
}
func Test_EnryTestSuite(t *testing.T) {
@ -30,25 +58,24 @@ func Test_EnryTestSuite(t *testing.T) {
func (s *EnryTestSuite) SetupSuite() {
var err error
s.repoLinguist = os.Getenv(linguistClonedEnvVar)
s.cloned = s.repoLinguist == ""
if s.cloned {
s.repoLinguist, err = ioutil.TempDir("", "linguist-")
assert.NoError(s.T(), err)
}
s.samplesDir = filepath.Join(s.repoLinguist, "samples")
if s.cloned {
cmd := exec.Command("git", "clone", linguistURL, s.repoLinguist)
s.tmpLinguist = os.Getenv(linguistClonedEnvVar)
s.needToClone = s.tmpLinguist == ""
if s.needToClone {
s.tmpLinguist, err = ioutil.TempDir("", "linguist-")
require.NoError(s.T(), err)
s.T().Logf("Cloning Linguist repo to '%s' as %s was not set\n",
s.tmpLinguist, linguistClonedEnvVar)
cmd := exec.Command("git", "clone", linguistURL, s.tmpLinguist)
err = cmd.Run()
assert.NoError(s.T(), err)
require.NoError(s.T(), err)
}
s.samplesDir = filepath.Join(s.tmpLinguist, "samples")
s.T().Logf("using samples from %s", s.samplesDir)
cwd, err := os.Getwd()
assert.NoError(s.T(), err)
err = os.Chdir(s.repoLinguist)
err = os.Chdir(s.tmpLinguist)
assert.NoError(s.T(), err)
cmd := exec.Command("git", "checkout", data.LinguistCommit)
@ -60,8 +87,8 @@ func (s *EnryTestSuite) SetupSuite() {
}
func (s *EnryTestSuite) TearDownSuite() {
if s.cloned {
err := os.RemoveAll(s.repoLinguist)
if s.needToClone {
err := os.RemoveAll(s.tmpLinguist)
assert.NoError(s.T(), err)
}
}
@ -88,7 +115,7 @@ func (s *EnryTestSuite) TestGetLanguage() {
}
func (s *EnryTestSuite) TestGetLanguagesByModelineLinguist() {
var modelinesDir = filepath.Join(s.repoLinguist, "test/fixtures/Data/Modelines")
var modelinesDir = filepath.Join(s.tmpLinguist, "test/fixtures/Data/Modelines")
tests := []struct {
name string
@ -400,7 +427,8 @@ func (s *EnryTestSuite) TestGetLanguageByAlias() {
func (s *EnryTestSuite) TestLinguistCorpus() {
const filenamesDir = "filenames"
var cornerCases = map[string]bool{
"hello.ms": true,
"drop_stuff.sql": true, // https://github.com/src-d/enry/issues/194
// .es and .ice fail heuristics parsing, but do not fail any tests
}
var total, failed, ok, other int
@ -408,7 +436,7 @@ func (s *EnryTestSuite) TestLinguistCorpus() {
filepath.Walk(s.samplesDir, func(path string, f os.FileInfo, err error) error {
if f.IsDir() {
if f.Name() != filenamesDir {
expected = f.Name()
expected, _ = data.LanguageByAlias(f.Name())
}
return nil
@ -431,11 +459,10 @@ func (s *EnryTestSuite) TestLinguistCorpus() {
} else {
status = "failed"
failed++
}
if _, ok := cornerCases[filename]; ok {
fmt.Printf("\t\t[considered corner case] %s\texpected: %s\tobtained: %s\tstatus: %s\n", filename, expected, obtained, status)
s.T().Logf("\t\t[considered corner case] %s\texpected: %s\tobtained: %s\tstatus: %s\n", filename, expected, obtained, status)
} else {
assert.Equal(s.T(), expected, obtained, fmt.Sprintf("%s\texpected: %s\tobtained: %s\tstatus: %s\n", filename, expected, obtained, status))
}
@ -443,5 +470,5 @@ func (s *EnryTestSuite) TestLinguistCorpus() {
return nil
})
fmt.Printf("\t\ttotal files: %d, ok: %d, failed: %d, other: %d\n", total, ok, failed, other)
s.T().Logf("\t\ttotal files: %d, ok: %d, failed: %d, other: %d\n", total, ok, failed, other)
}


@ -1,11 +1,13 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: 4cd558c37482e8d2c535d8107f2d11b49afbc5b5
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
// LanguagesByAlias keeps alias for different languages and use the name of the languages as an alias too.
import "strings"
// LanguageByAliasMap keeps aliases for different languages and uses the names of the languages as aliases too.
// All the keys (alias or not) are written in lower case and whitespace has been replaced by underscores.
var LanguagesByAlias = map[string]string{
var LanguageByAliasMap = map[string]string{
"1c_enterprise": "1C Enterprise",
"abap": "ABAP",
"abl": "OpenEdge ABL",
@ -32,6 +34,7 @@ var LanguagesByAlias = map[string]string{
"alpine_abuild": "Alpine Abuild",
"amfm": "Adobe Font Metrics",
"ampl": "AMPL",
"angelscript": "AngelScript",
"ant_build_system": "Ant Build System",
"antlr": "ANTLR",
"apache": "ApacheConf",
@ -40,79 +43,81 @@ var LanguagesByAlias = map[string]string{
"api_blueprint": "API Blueprint",
"apkbuild": "Alpine Abuild",
"apl": "APL",
"apollo_guidance_computer": "Apollo Guidance Computer",
"applescript": "AppleScript",
"arc": "Arc",
"arduino": "Arduino",
"arexx": "REXX",
"as3": "ActionScript",
"asciidoc": "AsciiDoc",
"asn.1": "ASN.1",
"asp": "ASP",
"aspectj": "AspectJ",
"aspx": "ASP",
"aspx-vb": "ASP",
"assembly": "Assembly",
"ats": "ATS",
"ats2": "ATS",
"au3": "AutoIt",
"augeas": "Augeas",
"autoconf": "M4Sugar",
"autohotkey": "AutoHotkey",
"autoit": "AutoIt",
"autoit3": "AutoIt",
"autoitscript": "AutoIt",
"awk": "Awk",
"b3d": "BlitzBasic",
"ballerina": "Ballerina",
"bash": "Shell",
"bash_session": "ShellSession",
"bat": "Batchfile",
"batch": "Batchfile",
"batchfile": "Batchfile",
"befunge": "Befunge",
"bison": "Bison",
"bitbake": "BitBake",
"blade": "Blade",
"blitz3d": "BlitzBasic",
"blitzbasic": "BlitzBasic",
"blitzmax": "BlitzMax",
"blitzplus": "BlitzBasic",
"bluespec": "Bluespec",
"bmax": "BlitzMax",
"boo": "Boo",
"bplus": "BlitzBasic",
"brainfuck": "Brainfuck",
"brightscript": "Brightscript",
"bro": "Bro",
"bsdmake": "Makefile",
"byond": "DM",
"c": "C",
"c#": "C#",
"c++": "C++",
"c++-objdump": "Cpp-ObjDump",
"c-objdump": "C-ObjDump",
"c2hs": "C2hs Haskell",
"c2hs_haskell": "C2hs Haskell",
"cap'n_proto": "Cap'n Proto",
"carto": "CartoCSS",
"cartocss": "CartoCSS",
"ceylon": "Ceylon",
"cfc": "ColdFusion CFC",
"cfm": "ColdFusion",
"cfml": "ColdFusion",
"chapel": "Chapel",
"charity": "Charity",
"chpl": "Chapel",
"chuck": "ChucK",
"cirru": "Cirru",
"clarion": "Clarion",
"clean": "Clean",
"click": "Click",
"clipper": "xBase",
"clips": "CLIPS",
"clojure": "Clojure",
"closure_templates": "Closure Templates",
"apollo_guidance_computer": "Apollo Guidance Computer",
"applescript": "AppleScript",
"arc": "Arc",
"arexx": "REXX",
"as3": "ActionScript",
"asciidoc": "AsciiDoc",
"asm": "Assembly",
"asn.1": "ASN.1",
"asp": "ASP",
"aspectj": "AspectJ",
"aspx": "ASP",
"aspx-vb": "ASP",
"assembly": "Assembly",
"asymptote": "Asymptote",
"ats": "ATS",
"ats2": "ATS",
"au3": "AutoIt",
"augeas": "Augeas",
"autoconf": "M4Sugar",
"autohotkey": "AutoHotkey",
"autoit": "AutoIt",
"autoit3": "AutoIt",
"autoitscript": "AutoIt",
"awk": "Awk",
"b3d": "BlitzBasic",
"ballerina": "Ballerina",
"bash": "Shell",
"bash_session": "ShellSession",
"bat": "Batchfile",
"batch": "Batchfile",
"batchfile": "Batchfile",
"befunge": "Befunge",
"bison": "Bison",
"bitbake": "BitBake",
"blade": "Blade",
"blitz3d": "BlitzBasic",
"blitzbasic": "BlitzBasic",
"blitzmax": "BlitzMax",
"blitzplus": "BlitzBasic",
"bluespec": "Bluespec",
"bmax": "BlitzMax",
"boo": "Boo",
"bplus": "BlitzBasic",
"brainfuck": "Brainfuck",
"brightscript": "Brightscript",
"bro": "Bro",
"bsdmake": "Makefile",
"byond": "DM",
"c": "C",
"c#": "C#",
"c++": "C++",
"c++-objdump": "Cpp-ObjDump",
"c-objdump": "C-ObjDump",
"c2hs": "C2hs Haskell",
"c2hs_haskell": "C2hs Haskell",
"cap'n_proto": "Cap'n Proto",
"carto": "CartoCSS",
"cartocss": "CartoCSS",
"ceylon": "Ceylon",
"cfc": "ColdFusion CFC",
"cfm": "ColdFusion",
"cfml": "ColdFusion",
"chapel": "Chapel",
"charity": "Charity",
"chpl": "Chapel",
"chuck": "ChucK",
"cirru": "Cirru",
"clarion": "Clarion",
"clean": "Clean",
"click": "Click",
"clipper": "xBase",
"clips": "CLIPS",
"clojure": "Clojure",
"closure_templates": "Closure Templates",
"cloud_firestore_security_rules": "Cloud Firestore Security Rules",
"cmake": "CMake",
"cobol": "COBOL",
"coffee": "CoffeeScript",
@ -123,10 +128,15 @@ var LanguagesByAlias = map[string]string{
"coldfusion_html": "ColdFusion",
"collada": "COLLADA",
"common_lisp": "Common Lisp",
"common_workflow_language": "Common Workflow Language",
"component_pascal": "Component Pascal",
"conll": "CoNLL-U",
"conll-u": "CoNLL-U",
"conll-x": "CoNLL-U",
"console": "ShellSession",
"cool": "Cool",
"coq": "Coq",
"cperl": "Perl",
"cpp": "C++",
"cpp-objdump": "Cpp-ObjDump",
"creole": "Creole",
@ -144,6 +154,7 @@ var LanguagesByAlias = map[string]string{
"cucumber": "Gherkin",
"cuda": "Cuda",
"cweb": "CWeb",
"cwl": "Common Workflow Language",
"cycript": "Cycript",
"cython": "Cython",
"d": "D",
@ -176,55 +187,69 @@ var LanguagesByAlias = map[string]string{
"ecl": "ECL",
"eclipse": "ECLiPSe",
"ecr": "HTML+ECR",
"edn": "edn",
"eeschema_schematic": "KiCad Schematic",
"eex": "HTML+EEX",
"eiffel": "Eiffel",
"ejs": "EJS",
"elisp": "Emacs Lisp",
"elixir": "Elixir",
"elm": "Elm",
"emacs": "Emacs Lisp",
"emacs_lisp": "Emacs Lisp",
"emberscript": "EmberScript",
"eq": "EQ",
"erb": "HTML+ERB",
"erlang": "Erlang",
"f#": "F#",
"factor": "Factor",
"fancy": "Fancy",
"fantom": "Fantom",
"filebench_wml": "Filebench WML",
"filterscript": "Filterscript",
"fish": "fish",
"flex": "Lex",
"flux": "FLUX",
"formatted": "Formatted",
"forth": "Forth",
"fortran": "Fortran",
"foxpro": "xBase",
"freemarker": "FreeMarker",
"frege": "Frege",
"fsharp": "F#",
"ftl": "FreeMarker",
"fundamental": "Text",
"g-code": "G-code",
"game_maker_language": "Game Maker Language",
"gams": "GAMS",
"gap": "GAP",
"edje_data_collection": "Edje Data Collection",
"edn": "edn",
"eeschema_schematic": "KiCad Schematic",
"eex": "HTML+EEX",
"eiffel": "Eiffel",
"ejs": "EJS",
"elisp": "Emacs Lisp",
"elixir": "Elixir",
"elm": "Elm",
"emacs": "Emacs Lisp",
"emacs_lisp": "Emacs Lisp",
"emberscript": "EmberScript",
"eml": "EML",
"eq": "EQ",
"erb": "HTML+ERB",
"erlang": "Erlang",
"f#": "F#",
"f*": "F*",
"factor": "Factor",
"fancy": "Fancy",
"fantom": "Fantom",
"figfont": "FIGlet Font",
"figlet_font": "FIGlet Font",
"filebench_wml": "Filebench WML",
"filterscript": "Filterscript",
"fish": "fish",
"flex": "Lex",
"flux": "FLUX",
"formatted": "Formatted",
"forth": "Forth",
"fortran": "Fortran",
"foxpro": "xBase",
"freemarker": "FreeMarker",
"frege": "Frege",
"fsharp": "F#",
"fstar": "F*",
"ftl": "FreeMarker",
"fundamental": "Text",
"g-code": "G-code",
"game_maker_language": "Game Maker Language",
"gams": "GAMS",
"gap": "GAP",
"gcc_machine_description": "GCC Machine Description",
"gdb": "GDB",
"gdscript": "GDScript",
"genie": "Genie",
"genshi": "Genshi",
"gentoo_ebuild": "Gentoo Ebuild",
"gentoo_eclass": "Gentoo Eclass",
"gerber_image": "Gerber Image",
"gettext_catalog": "Gettext Catalog",
"gf": "Grammatical Framework",
"gherkin": "Gherkin",
"glsl": "GLSL",
"glyph": "Glyph",
"gdb": "GDB",
"gdscript": "GDScript",
"genie": "Genie",
"genshi": "Genshi",
"gentoo_ebuild": "Gentoo Ebuild",
"gentoo_eclass": "Gentoo Eclass",
"gerber_image": "Gerber Image",
"gettext_catalog": "Gettext Catalog",
"gf": "Grammatical Framework",
"gherkin": "Gherkin",
"git-ignore": "Ignore List",
"git_attributes": "Git Attributes",
"git_config": "Git Config",
"gitattributes": "Git Attributes",
"gitconfig": "Git Config",
"gitignore": "Ignore List",
"gitmodules": "Git Config",
"glsl": "GLSL",
"glyph": "Glyph",
"glyph_bitmap_distribution_format": "Glyph Bitmap Distribution Format",
"gn": "GN",
"gnuplot": "Gnuplot",
"go": "Go",
@ -237,17 +262,20 @@ var LanguagesByAlias = map[string]string{
"graph_modeling_language": "Graph Modeling Language",
"graphql": "GraphQL",
"graphviz_(dot)": "Graphviz (DOT)",
"groff": "Roff",
"groovy": "Groovy",
"groovy_server_pages": "Groovy Server Pages",
"gsp": "Groovy Server Pages",
"hack": "Hack",
"haml": "Haml",
"handlebars": "Handlebars",
"haproxy": "HAProxy",
"harbour": "Harbour",
"haskell": "Haskell",
"haxe": "Haxe",
"hbs": "Handlebars",
"hcl": "HCL",
"hiveql": "HiveQL",
"hlsl": "HLSL",
"html": "HTML",
"html+django": "HTML+Django",
@ -257,16 +285,20 @@ var LanguagesByAlias = map[string]string{
"html+erb": "HTML+ERB",
"html+jinja": "HTML+Django",
"html+php": "HTML+PHP",
"html+razor": "HTML+Razor",
"html+ruby": "RHTML",
"htmlbars": "Handlebars",
"htmldjango": "HTML+Django",
"http": "HTTP",
"hxml": "HXML",
"hy": "Hy",
"hylang": "Hy",
"hyphy": "HyPhy",
"i7": "Inform 7",
"idl": "IDL",
"idris": "Idris",
"ignore": "Ignore List",
"ignore_list": "Ignore List",
"igor": "IGOR Pro",
"igor_pro": "IGOR Pro",
"igorpro": "IGOR Pro",
@ -286,6 +318,7 @@ var LanguagesByAlias = map[string]string{
"j": "J",
"jasmin": "Jasmin",
"java": "Java",
"java_properties": "Java Properties",
"java_server_page": "Groovy Server Pages",
"java_server_pages": "Java Server Pages",
"javascript": "JavaScript",
@ -297,6 +330,8 @@ var LanguagesByAlias = map[string]string{
"js": "JavaScript",
"json": "JSON",
"json5": "JSON5",
"json_with_comments": "JSON with Comments",
"jsonc": "JSON with Comments",
"jsoniq": "JSONiq",
"jsonld": "JSONLD",
"jsp": "Java Server Pages",
@ -340,6 +375,7 @@ var LanguagesByAlias = map[string]string{
"loomscript": "LoomScript",
"ls": "LiveScript",
"lsl": "LSL",
"ltspice_symbol": "LTspice Symbol",
"lua": "Lua",
"m": "M",
"m4": "M4",
@ -348,17 +384,22 @@ var LanguagesByAlias = map[string]string{
"make": "Makefile",
"makefile": "Makefile",
"mako": "Mako",
"man": "Roff",
"man-page": "Roff",
"man_page": "Roff",
"manpage": "Roff",
"markdown": "Markdown",
"marko": "Marko",
"markojs": "Marko",
"mask": "Mask",
"mathematica": "Mathematica",
"matlab": "Matlab",
"matlab": "MATLAB",
"maven_pom": "Maven POM",
"max": "Max",
"max/msp": "Max",
"maxmsp": "Max",
"maxscript": "MAXScript",
"mdoc": "Roff",
"mediawiki": "MediaWiki",
"mercury": "Mercury",
"meson": "Meson",
@ -369,6 +410,7 @@ var LanguagesByAlias = map[string]string{
"mma": "Mathematica",
"modelica": "Modelica",
"modula-2": "Modula-2",
"modula-3": "Modula-3",
"module_management_system": "Module Management System",
"monkey": "Monkey",
"moocode": "Moocode",
@ -380,6 +422,7 @@ var LanguagesByAlias = map[string]string{
"mumps": "M",
"mupad": "mupad",
"myghty": "Myghty",
"nanorc": "nanorc",
"nasm": "Assembly",
"ncl": "NCL",
"nearley": "Nearley",
@ -389,6 +432,7 @@ var LanguagesByAlias = map[string]string{
"netlinx+erb": "NetLinx+ERB",
"netlogo": "NetLogo",
"newlisp": "NewLisp",
"nextflow": "Nextflow",
"nginx": "Nginx",
"nginx_configuration_file": "Nginx",
"nim": "Nim",
@ -421,8 +465,9 @@ var LanguagesByAlias = map[string]string{
"objectpascal": "Component Pascal",
"objj": "Objective-J",
"ocaml": "OCaml",
"octave": "Matlab",
"octave": "MATLAB",
"omgrofl": "Omgrofl",
"oncrpc": "RPC",
"ooc": "ooc",
"opa": "Opa",
"opal": "Opal",
@ -445,219 +490,270 @@ var LanguagesByAlias = map[string]string{
"parrot": "Parrot",
"parrot_assembly": "Parrot Assembly",
"parrot_internal_representation": "Parrot Internal Representation",
"pascal": "Pascal",
"pasm": "Parrot Assembly",
"pawn": "PAWN",
"pcbnew": "KiCad Layout",
"pep8": "Pep8",
"perl": "Perl",
"perl_6": "Perl 6",
"php": "PHP",
"pic": "Pic",
"pickle": "Pickle",
"picolisp": "PicoLisp",
"piglatin": "PigLatin",
"pike": "Pike",
"pir": "Parrot Internal Representation",
"plpgsql": "PLpgSQL",
"plsql": "PLSQL",
"pod": "Pod",
"pogoscript": "PogoScript",
"pony": "Pony",
"posh": "PowerShell",
"postscr": "PostScript",
"postscript": "PostScript",
"pot": "Gettext Catalog",
"pov-ray": "POV-Ray SDL",
"pov-ray_sdl": "POV-Ray SDL",
"povray": "POV-Ray SDL",
"powerbuilder": "PowerBuilder",
"powershell": "PowerShell",
"processing": "Processing",
"progress": "OpenEdge ABL",
"prolog": "Prolog",
"propeller_spin": "Propeller Spin",
"protobuf": "Protocol Buffer",
"protocol_buffer": "Protocol Buffer",
"protocol_buffers": "Protocol Buffer",
"public_key": "Public Key",
"pug": "Pug",
"puppet": "Puppet",
"pure_data": "Pure Data",
"purebasic": "PureBasic",
"purescript": "PureScript",
"pycon": "Python console",
"pyrex": "Cython",
"python": "Python",
"python_console": "Python console",
"python_traceback": "Python traceback",
"qmake": "QMake",
"qml": "QML",
"r": "R",
"racket": "Racket",
"ragel": "Ragel",
"ragel-rb": "Ragel",
"ragel-ruby": "Ragel",
"rake": "Ruby",
"raml": "RAML",
"rascal": "Rascal",
"raw": "Raw token data",
"raw_token_data": "Raw token data",
"rb": "Ruby",
"rbx": "Ruby",
"rdoc": "RDoc",
"realbasic": "REALbasic",
"reason": "Reason",
"rebol": "Rebol",
"red": "Red",
"red/system": "Red",
"redcode": "Redcode",
"regex": "Regular Expression",
"regexp": "Regular Expression",
"regular_expression": "Regular Expression",
"ren'py": "Ren'Py",
"renderscript": "RenderScript",
"renpy": "Ren'Py",
"restructuredtext": "reStructuredText",
"rexx": "REXX",
"rhtml": "RHTML",
"ring": "Ring",
"rmarkdown": "RMarkdown",
"robotframework": "RobotFramework",
"roff": "Roff",
"rouge": "Rouge",
"rpm_spec": "RPM Spec",
"rs-274x": "Gerber Image",
"rscript": "R",
"rss": "XML",
"rst": "reStructuredText",
"ruby": "Ruby",
"runoff": "RUNOFF",
"rust": "Rust",
"rusthon": "Python",
"sage": "Sage",
"salt": "SaltStack",
"saltstack": "SaltStack",
"saltstate": "SaltStack",
"sas": "SAS",
"sass": "Sass",
"scala": "Scala",
"scaml": "Scaml",
"scheme": "Scheme",
"scilab": "Scilab",
"scss": "SCSS",
"self": "Self",
"sh": "Shell",
"shaderlab": "ShaderLab",
"shell": "Shell",
"shell-script": "Shell",
"shellsession": "ShellSession",
"shen": "Shen",
"slash": "Slash",
"slim": "Slim",
"smali": "Smali",
"smalltalk": "Smalltalk",
"smarty": "Smarty",
"sml": "Standard ML",
"smt": "SMT",
"sourcemod": "SourcePawn",
"sourcepawn": "SourcePawn",
"sparql": "SPARQL",
"specfile": "RPM Spec",
"spline_font_database": "Spline Font Database",
"splus": "R",
"sqf": "SQF",
"sql": "SQL",
"sqlpl": "SQLPL",
"squeak": "Smalltalk",
"squirrel": "Squirrel",
"srecode_template": "SRecode Template",
"stan": "Stan",
"standard_ml": "Standard ML",
"stata": "Stata",
"ston": "STON",
"stylus": "Stylus",
"sublime_text_config": "Sublime Text Config",
"subrip_text": "SubRip Text",
"supercollider": "SuperCollider",
"svg": "SVG",
"swift": "Swift",
"systemverilog": "SystemVerilog",
"tcl": "Tcl",
"tcsh": "Tcsh",
"tea": "Tea",
"terra": "Terra",
"tex": "TeX",
"text": "Text",
"textile": "Textile",
"thrift": "Thrift",
"ti_program": "TI Program",
"tl": "Type Language",
"tla": "TLA",
"toml": "TOML",
"ts": "TypeScript",
"turing": "Turing",
"turtle": "Turtle",
"twig": "Twig",
"txl": "TXL",
"type_language": "Type Language",
"typescript": "TypeScript",
"udiff": "Diff",
"unified_parallel_c": "Unified Parallel C",
"unity3d_asset": "Unity3D Asset",
"unix_assembly": "Unix Assembly",
"uno": "Uno",
"unrealscript": "UnrealScript",
"ur": "UrWeb",
"ur/web": "UrWeb",
"urweb": "UrWeb",
"vala": "Vala",
"vb.net": "Visual Basic",
"vbnet": "Visual Basic",
"vcl": "VCL",
"verilog": "Verilog",
"vhdl": "VHDL",
"vim": "Vim script",
"vim_script": "Vim script",
"viml": "Vim script",
"visual_basic": "Visual Basic",
"volt": "Volt",
"vue": "Vue",
"wasm": "WebAssembly",
"wast": "WebAssembly",
"wavefront_material": "Wavefront Material",
"wavefront_object": "Wavefront Object",
"web_ontology_language": "Web Ontology Language",
"webassembly": "WebAssembly",
"webidl": "WebIDL",
"winbatch": "Batchfile",
"wisp": "wisp",
"pascal": "Pascal",
"pasm": "Parrot Assembly",
"pawn": "Pawn",
"pcbnew": "KiCad Layout",
"pep8": "Pep8",
"perl": "Perl",
"perl6": "Perl 6",
"perl_6": "Perl 6",
"php": "PHP",
"pic": "Pic",
"pickle": "Pickle",
"picolisp": "PicoLisp",
"piglatin": "PigLatin",
"pike": "Pike",
"pir": "Parrot Internal Representation",
"plpgsql": "PLpgSQL",
"plsql": "PLSQL",
"pod": "Pod",
"pod_6": "Pod 6",
"pogoscript": "PogoScript",
"pony": "Pony",
"posh": "PowerShell",
"postcss": "PostCSS",
"postscr": "PostScript",
"postscript": "PostScript",
"pot": "Gettext Catalog",
"pov-ray": "POV-Ray SDL",
"pov-ray_sdl": "POV-Ray SDL",
"povray": "POV-Ray SDL",
"powerbuilder": "PowerBuilder",
"powershell": "PowerShell",
"processing": "Processing",
"progress": "OpenEdge ABL",
"prolog": "Prolog",
"propeller_spin": "Propeller Spin",
"protobuf": "Protocol Buffer",
"protocol_buffer": "Protocol Buffer",
"protocol_buffers": "Protocol Buffer",
"public_key": "Public Key",
"pug": "Pug",
"puppet": "Puppet",
"pure_data": "Pure Data",
"purebasic": "PureBasic",
"purescript": "PureScript",
"pwsh": "PowerShell",
"pycon": "Python console",
"pyrex": "Cython",
"python": "Python",
"python3": "Python",
"python_console": "Python console",
"python_traceback": "Python traceback",
"q": "q",
"qmake": "QMake",
"qml": "QML",
"quake": "Quake",
"r": "R",
"racket": "Racket",
"ragel": "Ragel",
"ragel-rb": "Ragel",
"ragel-ruby": "Ragel",
"rake": "Ruby",
"raml": "RAML",
"rascal": "Rascal",
"raw": "Raw token data",
"raw_token_data": "Raw token data",
"razor": "HTML+Razor",
"rb": "Ruby",
"rbx": "Ruby",
"rdoc": "RDoc",
"realbasic": "REALbasic",
"reason": "Reason",
"rebol": "Rebol",
"red": "Red",
"red/system": "Red",
"redcode": "Redcode",
"regex": "Regular Expression",
"regexp": "Regular Expression",
"regular_expression": "Regular Expression",
"ren'py": "Ren'Py",
"renderscript": "RenderScript",
"renpy": "Ren'Py",
"restructuredtext": "reStructuredText",
"rexx": "REXX",
"rhtml": "RHTML",
"ring": "Ring",
"rmarkdown": "RMarkdown",
"robotframework": "RobotFramework",
"roff": "Roff",
"roff_manpage": "Roff Manpage",
"rouge": "Rouge",
"rpc": "RPC",
"rpcgen": "RPC",
"rpm_spec": "RPM Spec",
"rs-274x": "Gerber Image",
"rscript": "R",
"rss": "XML",
"rst": "reStructuredText",
"ruby": "Ruby",
"runoff": "RUNOFF",
"rust": "Rust",
"rusthon": "Python",
"sage": "Sage",
"salt": "SaltStack",
"saltstack": "SaltStack",
"saltstate": "SaltStack",
"sas": "SAS",
"sass": "Sass",
"scala": "Scala",
"scaml": "Scaml",
"scheme": "Scheme",
"scilab": "Scilab",
"scss": "SCSS",
"sed": "sed",
"self": "Self",
"sh": "Shell",
"shaderlab": "ShaderLab",
"shell": "Shell",
"shell-script": "Shell",
"shellsession": "ShellSession",
"shen": "Shen",
"slash": "Slash",
"slice": "Slice",
"slim": "Slim",
"smali": "Smali",
"smalltalk": "Smalltalk",
"smarty": "Smarty",
"sml": "Standard ML",
"smt": "SMT",
"snippet": "YASnippet",
"solidity": "Solidity",
"sourcemod": "SourcePawn",
"sourcepawn": "SourcePawn",
"soy": "Closure Templates",
"sparql": "SPARQL",
"specfile": "RPM Spec",
"spline_font_database": "Spline Font Database",
"splus": "R",
"sqf": "SQF",
"sql": "SQL",
"sqlpl": "SQLPL",
"squeak": "Smalltalk",
"squirrel": "Squirrel",
"srecode_template": "SRecode Template",
"stan": "Stan",
"standard_ml": "Standard ML",
"stata": "Stata",
"ston": "STON",
"stylus": "Stylus",
"subrip_text": "SubRip Text",
"sugarss": "SugarSS",
"supercollider": "SuperCollider",
"svg": "SVG",
"swift": "Swift",
"systemverilog": "SystemVerilog",
"tcl": "Tcl",
"tcsh": "Tcsh",
"tea": "Tea",
"terra": "Terra",
"terraform": "HCL",
"tex": "TeX",
"text": "Text",
"textile": "Textile",
"thrift": "Thrift",
"ti_program": "TI Program",
"tl": "Type Language",
"tla": "TLA",
"toml": "TOML",
"troff": "Roff",
"ts": "TypeScript",
"turing": "Turing",
"turtle": "Turtle",
"twig": "Twig",
"txl": "TXL",
"type_language": "Type Language",
"typescript": "TypeScript",
"udiff": "Diff",
"unified_parallel_c": "Unified Parallel C",
"unity3d_asset": "Unity3D Asset",
"unix_assembly": "Unix Assembly",
"uno": "Uno",
"unrealscript": "UnrealScript",
"ur": "UrWeb",
"ur/web": "UrWeb",
"urweb": "UrWeb",
"vala": "Vala",
"vb.net": "Visual Basic",
"vbnet": "Visual Basic",
"vcl": "VCL",
"verilog": "Verilog",
"vhdl": "VHDL",
"vim": "Vim script",
"vim_script": "Vim script",
"viml": "Vim script",
"visual_basic": "Visual Basic",
"volt": "Volt",
"vue": "Vue",
"wasm": "WebAssembly",
"wast": "WebAssembly",
"wavefront_material": "Wavefront Material",
"wavefront_object": "Wavefront Object",
"wdl": "wdl",
"web_ontology_language": "Web Ontology Language",
"webassembly": "WebAssembly",
"webidl": "WebIDL",
"winbatch": "Batchfile",
"windows_registry_entries": "Windows Registry Entries",
"wisp": "wisp",
"world_of_warcraft_addon_data": "World of Warcraft Addon Data",
"wsdl": "XML",
"x10": "X10",
"xbase": "xBase",
"xc": "XC",
"xcompose": "XCompose",
"xhtml": "HTML",
"xml": "XML",
"xml+genshi": "Genshi",
"xml+kid": "Genshi",
"xojo": "Xojo",
"xpages": "XPages",
"xpm": "XPM",
"xproc": "XProc",
"xquery": "XQuery",
"xs": "XS",
"xsd": "XML",
"xsl": "XSLT",
"xslt": "XSLT",
"xten": "X10",
"xtend": "Xtend",
"yacc": "Yacc",
"yaml": "YAML",
"yang": "YANG",
"yml": "YAML",
"zephir": "Zephir",
"zimpl": "Zimpl",
"zsh": "Shell",
"wsdl": "XML",
"x10": "X10",
"x_bitmap": "X BitMap",
"x_font_directory_index": "X Font Directory Index",
"x_pixmap": "X PixMap",
"xbase": "xBase",
"xbm": "X BitMap",
"xc": "XC",
"xcompose": "XCompose",
"xdr": "RPC",
"xhtml": "HTML",
"xml": "XML",
"xml+genshi": "Genshi",
"xml+kid": "Genshi",
"xojo": "Xojo",
"xpages": "XPages",
"xpm": "X PixMap",
"xproc": "XProc",
"xquery": "XQuery",
"xs": "XS",
"xsd": "XML",
"xsl": "XSLT",
"xslt": "XSLT",
"xten": "X10",
"xtend": "Xtend",
"yacc": "Yacc",
"yaml": "YAML",
"yang": "YANG",
"yara": "YARA",
"yas": "YASnippet",
"yasnippet": "YASnippet",
"yml": "YAML",
"zephir": "Zephir",
"zig": "Zig",
"zimpl": "Zimpl",
"zsh": "Shell",
}
// LanguageByAlias looks up the language name by its alias or name.
// It mirrors the logic of GitHub Linguist and is needed e.g. for heuristics.yml,
// which mixes names and aliases in a language field (see XPM example).
func LanguageByAlias(langOrAlias string) (lang string, ok bool) {
k := convertToAliasKey(langOrAlias)
lang, ok = LanguageByAliasMap[k]
return
}
// convertToAliasKey converts language name to a key in LanguageByAliasMap.
// Following
// - internal.code-generator.generator.convertToAliasKey()
// - GetLanguageByAlias()
// conventions.
// It is here to avoid dependency on "generate" and "enry" packages.
func convertToAliasKey(langName string) string {
ak := strings.SplitN(langName, `,`, 2)[0]
ak = strings.Replace(ak, ` `, `_`, -1)
ak = strings.ToLower(ak)
return ak
}
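
The alias lookup above can be sketched in isolation as follows. This is a minimal, hedged illustration rather than the generated code itself: `aliasMap`, `toAliasKey`, and `lookup` are hypothetical names, and the map holds only a handful of entries copied from the table above.

```go
package main

import (
	"fmt"
	"strings"
)

// aliasMap is a tiny, hypothetical stand-in for the generated LanguageByAliasMap.
var aliasMap = map[string]string{
	"xpm":      "X PixMap",
	"x_pixmap": "X PixMap",
	"pwsh":     "PowerShell",
	"perl_6":   "Perl 6",
}

// toAliasKey follows the same steps as convertToAliasKey: keep the text
// before the first comma, replace spaces with underscores, then lowercase.
func toAliasKey(name string) string {
	key := strings.SplitN(name, ",", 2)[0]
	key = strings.Replace(key, " ", "_", -1)
	return strings.ToLower(key)
}

// lookup resolves either a language name or an alias to the language name.
func lookup(langOrAlias string) (string, bool) {
	lang, ok := aliasMap[toAliasKey(langOrAlias)]
	return lang, ok
}

func main() {
	for _, in := range []string{"XPM", "X PixMap", "Perl 6", "nope"} {
		lang, ok := lookup(in)
		fmt.Printf("%q -> %q (found: %v)\n", in, lang, ok)
	}
}
```

Because names and aliases are normalized to the same key form, a heuristics.yml entry may use either "XPM" or "X PixMap" and still resolve to the same language.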


@ -1,7 +1,7 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: 4cd558c37482e8d2c535d8107f2d11b49afbc5b5
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
// linguist's commit from which files were generated.
var LinguistCommit = "4cd558c37482e8d2c535d8107f2d11b49afbc5b5"
var LinguistCommit = "e4560984058b4726010ca4b8f03ed9d0f8f464db"

File diff suppressed because it is too large

3
data/doc.go Normal file

@ -0,0 +1,3 @@
// Package data contains only auto-generated data-structures for all the language
// identification strategies from the Linguist project sources.
package data


@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: 4cd558c37482e8d2c535d8107f2d11b49afbc5b5
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -8,10 +8,12 @@ import "gopkg.in/toqueteos/substring.v1"
var DocumentationMatchers = substring.Or(
substring.Regexp(`^[Dd]ocs?/`),
substring.Regexp(`(^|/)[Dd]ocumentation/`),
substring.Regexp(`(^|/)[Gg]roovydoc/`),
substring.Regexp(`(^|/)[Jj]avadoc/`),
substring.Regexp(`^[Mm]an/`),
substring.Regexp(`^[Ee]xamples/`),
substring.Regexp(`^[Dd]emos?/`),
substring.Regexp(`(^|/)inst/doc/`),
substring.Regexp(`(^|/)CHANGE(S|LOG)?(\.|$)`),
substring.Regexp(`(^|/)CONTRIBUTING(\.|$)`),
substring.Regexp(`(^|/)COPYING(\.|$)`),

File diff suppressed because it is too large


@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: 4cd558c37482e8d2c535d8107f2d11b49afbc5b5
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -8,41 +8,65 @@ var LanguagesByFilename = map[string][]string{
".XCompose": {"XCompose"},
".abbrev_defs": {"Emacs Lisp"},
".arcconfig": {"JSON"},
".babelrc": {"JSON5"},
".atomignore": {"Ignore List"},
".babelignore": {"Ignore List"},
".babelrc": {"JSON with Comments"},
".bash_aliases": {"Shell"},
".bash_history": {"Shell"},
".bash_logout": {"Shell"},
".bash_profile": {"Shell"},
".bashrc": {"Shell"},
".bzrignore": {"Ignore List"},
".clang-format": {"YAML"},
".clang-tidy": {"YAML"},
".classpath": {"XML"},
".coffeelintignore": {"Ignore List"},
".cproject": {"XML"},
".cshrc": {"Shell"},
".cvsignore": {"Ignore List"},
".dockerignore": {"Ignore List"},
".editorconfig": {"INI"},
".emacs": {"Emacs Lisp"},
".emacs.desktop": {"Emacs Lisp"},
".eslintignore": {"Ignore List"},
".eslintrc.json": {"JSON with Comments"},
".factor-boot-rc": {"Factor"},
".factor-rc": {"Factor"},
".gclient": {"Python"},
".gemrc": {"YAML"},
".gitconfig": {"INI"},
".gitattributes": {"Git Attributes"},
".gitconfig": {"Git Config"},
".gitignore": {"Ignore List"},
".gitmodules": {"Git Config"},
".gn": {"GN"},
".gnus": {"Emacs Lisp"},
".gvimrc": {"Vim script"},
".htaccess": {"ApacheConf"},
".htmlhintrc": {"JSON"},
".irbrc": {"Ruby"},
".jshintrc": {"JSON"},
".jscsrc": {"JSON with Comments"},
".jshintrc": {"JSON with Comments"},
".jslintrc": {"JSON with Comments"},
".login": {"Shell"},
".nanorc": {"nanorc"},
".nodemonignore": {"Ignore List"},
".npmignore": {"Ignore List"},
".nvimrc": {"Vim script"},
".php": {"PHP"},
".php_cs": {"PHP"},
".php_cs.dist": {"PHP"},
".prettierignore": {"Ignore List"},
".profile": {"Shell"},
".project": {"XML"},
".pryrc": {"Ruby"},
".spacemacs": {"Emacs Lisp"},
".stylelintignore": {"Ignore List"},
".tern-config": {"JSON"},
".tern-project": {"JSON"},
".vimrc": {"Vim script"},
".viper": {"Emacs Lisp"},
".vscodeignore": {"Ignore List"},
".watchmanconfig": {"JSON"},
".zlogin": {"Shell"},
".zlogout": {"Shell"},
".zprofile": {"Shell"},
@ -55,6 +79,7 @@ var LanguagesByFilename = map[string][]string{
"BSDmakefile": {"Makefile"},
"BUCK": {"Python"},
"BUILD": {"Python"},
"BUILD.bazel": {"Python"},
"Berksfile": {"Ruby"},
"Brewfile": {"Ruby"},
"Buildfile": {"Ruby"},
@ -64,6 +89,7 @@ var LanguagesByFilename = map[string][]string{
"COPYRIGHT.regex": {"Text"},
"Cakefile": {"CoffeeScript"},
"Capfile": {"Ruby"},
"Cargo.lock": {"TOML"},
"Cask": {"Emacs Lisp"},
"Dangerfile": {"Ruby"},
"Deliverfile": {"Ruby"},
@ -75,6 +101,7 @@ var LanguagesByFilename = map[string][]string{
"GNUmakefile": {"Makefile"},
"Gemfile": {"Ruby"},
"Gemfile.lock": {"Ruby"},
"Gopkg.lock": {"TOML"},
"Guardfile": {"Ruby"},
"INSTALL": {"Text"},
"INSTALL.mysql": {"Text"},
@ -85,6 +112,7 @@ var LanguagesByFilename = map[string][]string{
"LICENSE": {"Text"},
"LICENSE.mysql": {"Text"},
"Makefile": {"Makefile"},
"Makefile.PL": {"Perl"},
"Makefile.am": {"Makefile"},
"Makefile.boot": {"Makefile"},
"Makefile.frag": {"Makefile"},
@ -107,7 +135,7 @@ var LanguagesByFilename = map[string][]string{
"README.mysql": {"Text"},
"ROOT": {"Isabelle ROOT"},
"Rakefile": {"Ruby"},
"Rexfile": {"Perl 6"},
"Rexfile": {"Perl"},
"SConscript": {"Python"},
"SConstruct": {"Python"},
"Settings.StyleCop": {"XML"},
@ -127,6 +155,7 @@ var LanguagesByFilename = map[string][]string{
"ack": {"Perl"},
"ant.xml": {"Ant Build System"},
"apache2.conf": {"ApacheConf"},
"bash_aliases": {"Shell"},
"bash_logout": {"Shell"},
"bash_profile": {"Shell"},
"bashrc": {"Shell"},
@ -136,19 +165,34 @@ var LanguagesByFilename = map[string][]string{
"click.me": {"Text"},
"composer.lock": {"JSON"},
"configure.ac": {"M4Sugar"},
"contents.lr": {"Markdown"},
"cpanfile": {"Perl"},
"cshrc": {"Shell"},
"delete.me": {"Text"},
"descrip.mmk": {"Module Management System"},
"descrip.mms": {"Module Management System"},
"encodings.dir": {"X Font Directory Index"},
"expr-dist": {"R"},
"firestore.rules": {"Cloud Firestore Security Rules"},
"fonts.alias": {"X Font Directory Index"},
"fonts.dir": {"X Font Directory Index"},
"fonts.scale": {"X Font Directory Index"},
"fp-lib-table": {"KiCad Layout"},
"gitignore-global": {"Ignore List"},
"gitignore_global": {"Ignore List"},
"glide.lock": {"YAML"},
"go.mod": {"Text"},
"go.sum": {"Text"},
"gradlew": {"Shell"},
"gvimrc": {"Vim script"},
"haproxy.cfg": {"HAProxy"},
"httpd.conf": {"ApacheConf"},
"jsconfig.json": {"JSON with Comments"},
"keep.me": {"Text"},
"ld.script": {"Linker Script"},
"login": {"Shell"},
"m3makefile": {"Quake"},
"m3overrides": {"Quake"},
"makefile": {"Makefile"},
"makefile.sco": {"Makefile"},
"man": {"Shell"},
@ -159,7 +203,10 @@ var LanguagesByFilename = map[string][]string{
"mkfile": {"Makefile"},
"mmn": {"Roff"},
"mmt": {"Roff"},
"nanorc": {"nanorc"},
"nextflow.config": {"Nextflow"},
"nginx.conf": {"Nginx"},
"nim.cfg": {"Nim"},
"nvimrc": {"Vim script"},
"owh": {"Tcl"},
"packages.config": {"XML"},
@ -171,9 +218,9 @@ var LanguagesByFilename = map[string][]string{
"rebar.config.lock": {"Erlang"},
"rebar.lock": {"Erlang"},
"riemann.config": {"Clojure"},
"script": {"C"},
"starfield": {"Tcl"},
"test.me": {"Text"},
"tsconfig.json": {"JSON with Comments"},
"vimrc": {"Vim script"},
"wscript": {"Python"},
"xcompose": {"XCompose"},

File diff suppressed because it is too large

35
data/heuristics.go Normal file

@ -0,0 +1,35 @@
package data
import "gopkg.in/src-d/enry.v1/data/rule"
// Heuristics implements a rule-based content matching engine.
// Heuristics is a number of sequentially applied rule.Heuristic where a
// matching one disambiguates language(s) for a single file extension.
type Heuristics []rule.Heuristic
// Match returns languages identified by the matching rule of the heuristic.
func (hs Heuristics) Match(data []byte) []string {
var matchedLangs []string
for _, heuristic := range hs {
if heuristic.Match(data) {
for _, langOrAlias := range heuristic.Languages() {
lang, ok := LanguageByAlias(langOrAlias)
if !ok { // should never happen
// reaching here means language name/alias in heuristics.yml
// is not consistent with languages.yml
// but we do not surface any such error at the API
continue
}
matchedLangs = append(matchedLangs, lang)
}
break
}
}
return matchedLangs
}
// matchString is a convenience used only in tests.
func (hs *Heuristics) matchString(data string) []string {
return hs.Match([]byte(data))
}
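
The first-match-wins behaviour of `Match` above can be sketched without the enry packages. This is a simplified illustration under stated assumptions: the `rule` type, `firstMatch`, and `mdRules` are hypothetical, a nil pattern stands in for an "always" rule, and the alias resolution step is omitted.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is a hypothetical, simplified stand-in for rule.Heuristic.
type rule struct {
	langs   []string
	pattern *regexp.Regexp // nil means the rule always matches
}

// match reports whether the rule applies to the given content.
func (r rule) match(data []byte) bool {
	return r.pattern == nil || r.pattern.Match(data)
}

// firstMatch walks the rules in order and returns the languages of the
// first rule that matches; later rules are never consulted.
func firstMatch(rules []rule, data []byte) []string {
	for _, r := range rules {
		if r.match(data) {
			return r.langs
		}
	}
	return nil
}

// mdRules loosely follows the ".md" fixture from the test file in this commit.
var mdRules = []rule{
	{[]string{"GCC Machine Description"}, regexp.MustCompile(`^(;;|\(define_)`)},
	{[]string{"Markdown"}, nil}, // fallback
}

func main() {
	fmt.Println(firstMatch(mdRules, []byte(";; machine description")))
	fmt.Println(firstMatch(mdRules, []byte("# Title")))
}
```

Rule order carries the precedence here, so a fallback rule only makes sense as the last element of the slice.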

61
data/heuristics_test.go Normal file

@ -0,0 +1,61 @@
package data
import (
"regexp"
"testing"
"github.com/stretchr/testify/assert"
"gopkg.in/src-d/enry.v1/data/rule"
)
var testContentHeuristics = map[string]*Heuristics{
".md": &Heuristics{ // final pattern for parsed YAML rule
rule.Or(
rule.MatchingLanguages("Markdown"),
regexp.MustCompile(`(^[-A-Za-z0-9=#!\*\[|>])|<\/ | \A\z`),
),
rule.Or(
rule.MatchingLanguages("GCC Machine Description"),
regexp.MustCompile(`^(;;|\(define_)`),
),
rule.Always(
rule.MatchingLanguages("Markdown"),
),
},
".ms": &Heuristics{
// Order defines precedence: And, Or, Not, Named, Always
rule.And(
rule.MatchingLanguages("Unix Assembly"),
rule.Not(rule.MatchingLanguages(""), regexp.MustCompile(`/\*`)),
rule.Or(
rule.MatchingLanguages(""),
regexp.MustCompile(`^\s*\.(?:include\s|globa?l\s|[A-Za-z][_A-Za-z0-9]*:)`),
),
),
rule.Or(
rule.MatchingLanguages("Roff"),
regexp.MustCompile(`^[.''][A-Za-z]{2}(\s|$)`),
),
rule.Always(
rule.MatchingLanguages("MAXScript"),
),
},
}
func TestContentHeuristic_MatchingAlways(t *testing.T) {
lang := testContentHeuristics[".md"].matchString("")
assert.Equal(t, []string{"Markdown"}, lang)
lang = testContentHeuristics[".ms"].matchString("")
assert.Equal(t, []string{"MAXScript"}, lang)
}
func TestContentHeuristic_MatchingAnd(t *testing.T) {
lang := testContentHeuristics[".md"].matchString(";;")
assert.Equal(t, []string{"GCC Machine Description"}, lang)
}
func TestContentHeuristic_MatchingOr(t *testing.T) {
lang := testContentHeuristics[".ms"].matchString(" .include \"math.s\"")
assert.Equal(t, []string{"Unix Assembly"}, lang)
}


@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: 4cd558c37482e8d2c535d8107f2d11b49afbc5b5
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -8,6 +8,7 @@ var LanguagesByInterpreter = map[string][]string{
"apl": {"APL"},
"aplx": {"APL"},
"ash": {"Shell"},
"asy": {"Asymptote"},
"awk": {"Awk"},
"bash": {"Shell"},
"bigloo": {"Scheme"},
@ -16,9 +17,11 @@ var LanguagesByInterpreter = map[string][]string{
"chicken": {"Scheme"},
"clisp": {"Common Lisp"},
"coffee": {"CoffeeScript"},
"cperl": {"Perl"},
"crystal": {"Crystal"},
"csi": {"Scheme"},
"cvc4": {"SMT"},
"cwl-runner": {"Common Workflow Language"},
"dart": {"Dart"},
"dash": {"Shell"},
"dtrace": {"DTrace"},
@ -34,7 +37,9 @@ var LanguagesByInterpreter = map[string][]string{
"gnuplot": {"Gnuplot"},
"gosh": {"Scheme"},
"groovy": {"Groovy"},
"gsed": {"sed"},
"guile": {"Scheme"},
"hy": {"Hy"},
"instantfpc": {"Pascal"},
"io": {"Io"},
"ioke": {"Ioke"},
@ -50,12 +55,14 @@ var LanguagesByInterpreter = map[string][]string{
"make": {"Makefile"},
"mathsat5": {"SMT"},
"mawk": {"Awk"},
"minised": {"sed"},
"mksh": {"Shell"},
"mmi": {"Mercury"},
"moon": {"MoonScript"},
"nawk": {"Awk"},
"newlisp": {"NewLisp"},
"node": {"JavaScript"},
"nextflow": {"Nextflow"},
"node": {"JavaScript", "TypeScript"},
"nush": {"Nu"},
"ocaml": {"OCaml", "Reason"},
"ocamlrun": {"OCaml"},
@ -66,11 +73,12 @@ var LanguagesByInterpreter = map[string][]string{
"parrot": {"Parrot Assembly", "Parrot Internal Representation"},
"pdksh": {"Shell"},
"perl": {"Perl", "Pod"},
"perl6": {"Perl 6"},
"perl6": {"Perl 6", "Pod 6"},
"php": {"PHP"},
"picolisp": {"PicoLisp"},
"pike": {"Pike"},
"pil": {"PicoLisp"},
"pwsh": {"PowerShell"},
"python": {"Python"},
"python2": {"Python"},
"python3": {"Python"},
@ -87,11 +95,14 @@ var LanguagesByInterpreter = map[string][]string{
"runhaskell": {"Haskell"},
"sbcl": {"Common Lisp"},
"scala": {"Scala"},
"scheme": {"Scheme"},
"sclang": {"SuperCollider"},
"scsynth": {"SuperCollider"},
"sed": {"sed"},
"sh": {"Shell"},
"smt-rat": {"SMT"},
"smtinterpol": {"SMT"},
"ssed": {"sed"},
"stp": {"SMT"},
"swipl": {"Prolog"},
"tcc": {"C"},


@ -1,204 +1,219 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: 4cd558c37482e8d2c535d8107f2d11b49afbc5b5
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
var LanguagesMime = map[string]string{
"AGS Script": "text/x-c++src",
"APL": "text/apl",
"ASN.1": "text/x-ttcn-asn",
"ASP": "application/x-aspx",
"Alpine Abuild": "text/x-sh",
"Ant Build System": "application/xml",
"Apex": "text/x-java",
"Arduino": "text/x-c++src",
"Brainfuck": "text/x-brainfuck",
"C": "text/x-csrc",
"C#": "text/x-csharp",
"C++": "text/x-c++src",
"C2hs Haskell": "text/x-haskell",
"CMake": "text/x-cmake",
"COBOL": "text/x-cobol",
"COLLADA": "text/xml",
"CSON": "text/x-coffeescript",
"CSS": "text/css",
"ChucK": "text/x-java",
"Clojure": "text/x-clojure",
"Closure Templates": "text/x-soy",
"CoffeeScript": "text/x-coffeescript",
"Common Lisp": "text/x-common-lisp",
"Component Pascal": "text/x-pascal",
"Crystal": "text/x-crystal",
"Cuda": "text/x-c++src",
"Cycript": "text/javascript",
"Cython": "text/x-cython",
"D": "text/x-d",
"DTrace": "text/x-csrc",
"Dart": "application/dart",
"Diff": "text/x-diff",
"Dockerfile": "text/x-dockerfile",
"Dylan": "text/x-dylan",
"EBNF": "text/x-ebnf",
"ECL": "text/x-ecl",
"EQ": "text/x-csharp",
"Eagle": "text/xml",
"Easybuild": "text/x-python",
"Ecere Projects": "application/json",
"Eiffel": "text/x-eiffel",
"Elm": "text/x-elm",
"Emacs Lisp": "text/x-common-lisp",
"EmberScript": "text/x-coffeescript",
"Erlang": "text/x-erlang",
"F#": "text/x-fsharp",
"Factor": "text/x-factor",
"Forth": "text/x-forth",
"Fortran": "text/x-fortran",
"GCC Machine Description": "text/x-common-lisp",
"AGS Script": "text/x-c++src",
"APL": "text/apl",
"ASN.1": "text/x-ttcn-asn",
"ASP": "application/x-aspx",
"Alpine Abuild": "text/x-sh",
"AngelScript": "text/x-c++src",
"Ant Build System": "application/xml",
"Apex": "text/x-java",
"Asymptote": "text/x-kotlin",
"Brainfuck": "text/x-brainfuck",
"C": "text/x-csrc",
"C#": "text/x-csharp",
"C++": "text/x-c++src",
"C2hs Haskell": "text/x-haskell",
"CMake": "text/x-cmake",
"COBOL": "text/x-cobol",
"COLLADA": "text/xml",
"CSON": "text/x-coffeescript",
"CSS": "text/css",
"ChucK": "text/x-java",
"Clojure": "text/x-clojure",
"Closure Templates": "text/x-soy",
"Cloud Firestore Security Rules": "text/css",
"CoffeeScript": "text/x-coffeescript",
"Common Lisp": "text/x-common-lisp",
"Common Workflow Language": "text/x-yaml",
"Component Pascal": "text/x-pascal",
"Crystal": "text/x-crystal",
"Cuda": "text/x-c++src",
"Cycript": "text/javascript",
"Cython": "text/x-cython",
"D": "text/x-d",
"DTrace": "text/x-csrc",
"Dart": "application/dart",
"Diff": "text/x-diff",
"Dockerfile": "text/x-dockerfile",
"Dylan": "text/x-dylan",
"EBNF": "text/x-ebnf",
"ECL": "text/x-ecl",
"EQ": "text/x-csharp",
"Eagle": "text/xml",
"Easybuild": "text/x-python",
"Ecere Projects": "application/json",
"Edje Data Collection": "application/json",
"Eiffel": "text/x-eiffel",
"Elm": "text/x-elm",
"Emacs Lisp": "text/x-common-lisp",
"EmberScript": "text/x-coffeescript",
"Erlang": "text/x-erlang",
"F#": "text/x-fsharp",
"Factor": "text/x-factor",
"Forth": "text/x-forth",
"Fortran": "text/x-fortran",
"GCC Machine Description": "text/x-common-lisp",
"GN": "text/x-python",
"Game Maker Language": "text/x-c++src",
"Genshi": "text/xml",
"Gentoo Ebuild": "text/x-sh",
"Gentoo Eclass": "text/x-sh",
"Git Attributes": "text/x-sh",
"Git Config": "text/x-properties",
"Glyph": "text/x-tcl",
"Go": "text/x-go",
"Grammatical Framework": "text/x-haskell",
"Groovy": "text/x-groovy",
"Groovy Server Pages": "application/x-jsp",
"HCL": "text/x-ruby",
"HTML": "text/html",
"HTML+Django": "text/x-django",
"HTML+ECR": "text/html",
"HTML+EEX": "text/html",
"HTML+ERB": "application/x-erb",
"HTML+PHP": "application/x-httpd-php",
"HTTP": "message/http",
"Hack": "application/x-httpd-php",
"Haml": "text/x-haml",
"Haskell": "text/x-haskell",
"Haxe": "text/x-haxe",
"IDL": "text/x-idl",
"INI": "text/x-properties",
"IRC log": "text/mirc",
"JSON": "application/json",
"JSON5": "application/json",
"JSONiq": "application/json",
"JSX": "text/jsx",
"Java": "text/x-java",
"Java Server Pages": "application/x-jsp",
"JavaScript": "text/javascript",
"Julia": "text/x-julia",
"Jupyter Notebook": "application/json",
"KiCad Layout": "text/x-common-lisp",
"Kit": "text/html",
"Kotlin": "text/x-kotlin",
"LFE": "text/x-common-lisp",
"LabVIEW": "text/xml",
"Latte": "text/x-smarty",
"Less": "text/css",
"Literate Haskell": "text/x-literate-haskell",
"LiveScript": "text/x-livescript",
"LookML": "text/x-yaml",
"Lua": "text/x-lua",
"M": "text/x-mumps",
"MTML": "text/html",
"MUF": "text/x-forth",
"Makefile": "text/x-cmake",
"Markdown": "text/x-gfm",
"Marko": "text/html",
"Mathematica": "text/x-mathematica",
"Matlab": "text/x-octave",
"Maven POM": "text/xml",
"Max": "application/json",
"Metal": "text/x-c++src",
"Mirah": "text/x-ruby",
"Modelica": "text/x-modelica",
"NSIS": "text/x-nsis",
"NetLogo": "text/x-common-lisp",
"NewLisp": "text/x-common-lisp",
"Nginx": "text/x-nginx-conf",
"Nu": "text/x-scheme",
"NumPy": "text/x-python",
"OCaml": "text/x-ocaml",
"Objective-C": "text/x-objectivec",
"Objective-C++": "text/x-objectivec",
"OpenCL": "text/x-csrc",
"OpenRC runscript": "text/x-sh",
"Oz": "text/x-oz",
"PHP": "application/x-httpd-php",
"PLSQL": "text/x-plsql",
"PLpgSQL": "text/x-sql",
"Pascal": "text/x-pascal",
"Perl": "text/x-perl",
"Perl 6": "text/x-perl",
"Pic": "text/troff",
"Pod": "text/x-perl",
"PowerShell": "application/x-powershell",
"Protocol Buffer": "text/x-protobuf",
"Public Key": "application/pgp",
"Pug": "text/x-pug",
"Puppet": "text/x-puppet",
"PureScript": "text/x-haskell",
"Python": "text/x-python",
"R": "text/x-rsrc",
"RAML": "text/x-yaml",
"RHTML": "application/x-erb",
"RMarkdown": "text/x-gfm",
"RPM Spec": "text/x-rpm-spec",
"Reason": "text/x-rustsrc",
"Roff": "text/troff",
"Rouge": "text/x-clojure",
"Ruby": "text/x-ruby",
"Rust": "text/x-rustsrc",
"SAS": "text/x-sas",
"SCSS": "text/x-scss",
"SPARQL": "application/sparql-query",
"SQL": "text/x-sql",
"SQLPL": "text/x-sql",
"SRecode Template": "text/x-common-lisp",
"SVG": "text/xml",
"Sage": "text/x-python",
"SaltStack": "text/x-yaml",
"Sass": "text/x-sass",
"Scala": "text/x-scala",
"Scheme": "text/x-scheme",
"Shell": "text/x-sh",
"ShellSession": "text/x-sh",
"Slim": "text/x-slim",
"Smalltalk": "text/x-stsrc",
"Smarty": "text/x-smarty",
"Squirrel": "text/x-c++src",
"Standard ML": "text/x-ocaml",
"Sublime Text Config": "text/javascript",
"Swift": "text/x-swift",
"SystemVerilog": "text/x-systemverilog",
"TOML": "text/x-toml",
"Tcl": "text/x-tcl",
"Tcsh": "text/x-sh",
"TeX": "text/x-stex",
"Terra": "text/x-lua",
"Textile": "text/x-textile",
"Turtle": "text/turtle",
"Twig": "text/x-twig",
"TypeScript": "application/typescript",
"Unified Parallel C": "text/x-csrc",
"Unity3D Asset": "text/x-yaml",
"Uno": "text/x-csharp",
"UnrealScript": "text/x-java",
"VHDL": "text/x-vhdl",
"Verilog": "text/x-verilog",
"Visual Basic": "text/x-vb",
"Volt": "text/x-d",
"WebAssembly": "text/x-common-lisp",
"WebIDL": "text/x-webidl",
"XC": "text/x-csrc",
"XML": "text/xml",
"XPages": "text/xml",
"XProc": "text/xml",
"XQuery": "application/xquery",
"XS": "text/x-csrc",
"XSLT": "text/xml",
"YAML": "text/x-yaml",
"edn": "text/x-clojure",
"reStructuredText": "text/x-rst",
"wisp": "text/x-clojure",
"HCL": "text/x-ruby",
"HTML": "text/html",
"HTML+Django": "text/x-django",
"HTML+ECR": "text/html",
"HTML+EEX": "text/html",
"HTML+ERB": "application/x-erb",
"HTML+PHP": "application/x-httpd-php",
"HTML+Razor": "text/html",
"HTTP": "message/http",
"Hack": "application/x-httpd-php",
"Haml": "text/x-haml",
"Haskell": "text/x-haskell",
"Haxe": "text/x-haxe",
"IDL": "text/x-idl",
"INI": "text/x-properties",
"IRC log": "text/mirc",
"Ignore List": "text/x-sh",
"JSON": "application/json",
"JSON with Comments": "text/javascript",
"JSON5": "application/json",
"JSONLD": "application/json",
"JSONiq": "application/json",
"JSX": "text/jsx",
"Java": "text/x-java",
"Java Properties": "text/x-properties",
"Java Server Pages": "application/x-jsp",
"JavaScript": "text/javascript",
"Julia": "text/x-julia",
"Jupyter Notebook": "application/json",
"KiCad Layout": "text/x-common-lisp",
"Kit": "text/html",
"Kotlin": "text/x-kotlin",
"LFE": "text/x-common-lisp",
"LTspice Symbol": "text/x-spreadsheet",
"LabVIEW": "text/xml",
"Latte": "text/x-smarty",
"Less": "text/css",
"Literate Haskell": "text/x-literate-haskell",
"LiveScript": "text/x-livescript",
"LookML": "text/x-yaml",
"Lua": "text/x-lua",
"M": "text/x-mumps",
"MATLAB": "text/x-octave",
"MTML": "text/html",
"MUF": "text/x-forth",
"Makefile": "text/x-cmake",
"Markdown": "text/x-gfm",
"Marko": "text/html",
"Mathematica": "text/x-mathematica",
"Maven POM": "text/xml",
"Max": "application/json",
"Metal": "text/x-c++src",
"Mirah": "text/x-ruby",
"Modelica": "text/x-modelica",
"NSIS": "text/x-nsis",
"NetLogo": "text/x-common-lisp",
"NewLisp": "text/x-common-lisp",
"Nginx": "text/x-nginx-conf",
"Nu": "text/x-scheme",
"NumPy": "text/x-python",
"OCaml": "text/x-ocaml",
"Objective-C": "text/x-objectivec",
"Objective-C++": "text/x-objectivec",
"OpenCL": "text/x-csrc",
"OpenRC runscript": "text/x-sh",
"Oz": "text/x-oz",
"PHP": "application/x-httpd-php",
"PLSQL": "text/x-plsql",
"PLpgSQL": "text/x-sql",
"Pascal": "text/x-pascal",
"Perl": "text/x-perl",
"Perl 6": "text/x-perl",
"Pic": "text/troff",
"Pod": "text/x-perl",
"PowerShell": "application/x-powershell",
"Protocol Buffer": "text/x-protobuf",
"Public Key": "application/pgp",
"Pug": "text/x-pug",
"Puppet": "text/x-puppet",
"PureScript": "text/x-haskell",
"Python": "text/x-python",
"R": "text/x-rsrc",
"RAML": "text/x-yaml",
"RHTML": "application/x-erb",
"RMarkdown": "text/x-gfm",
"RPM Spec": "text/x-rpm-spec",
"Reason": "text/x-rustsrc",
"Roff": "text/troff",
"Roff Manpage": "text/troff",
"Rouge": "text/x-clojure",
"Ruby": "text/x-ruby",
"Rust": "text/x-rustsrc",
"SAS": "text/x-sas",
"SCSS": "text/x-scss",
"SPARQL": "application/sparql-query",
"SQL": "text/x-sql",
"SQLPL": "text/x-sql",
"SRecode Template": "text/x-common-lisp",
"SVG": "text/xml",
"Sage": "text/x-python",
"SaltStack": "text/x-yaml",
"Sass": "text/x-sass",
"Scala": "text/x-scala",
"Scheme": "text/x-scheme",
"Shell": "text/x-sh",
"ShellSession": "text/x-sh",
"Slim": "text/x-slim",
"Smalltalk": "text/x-stsrc",
"Smarty": "text/x-smarty",
"Squirrel": "text/x-c++src",
"Standard ML": "text/x-ocaml",
"Swift": "text/x-swift",
"SystemVerilog": "text/x-systemverilog",
"TOML": "text/x-toml",
"Tcl": "text/x-tcl",
"Tcsh": "text/x-sh",
"TeX": "text/x-stex",
"Terra": "text/x-lua",
"Textile": "text/x-textile",
"Turtle": "text/turtle",
"Twig": "text/x-twig",
"TypeScript": "application/typescript",
"Unified Parallel C": "text/x-csrc",
"Unity3D Asset": "text/x-yaml",
"Uno": "text/x-csharp",
"UnrealScript": "text/x-java",
"VHDL": "text/x-vhdl",
"Verilog": "text/x-verilog",
"Visual Basic": "text/x-vb",
"Volt": "text/x-d",
"WebAssembly": "text/x-common-lisp",
"WebIDL": "text/x-webidl",
"Windows Registry Entries": "text/x-properties",
"X BitMap": "text/x-csrc",
"X PixMap": "text/x-csrc",
"XC": "text/x-csrc",
"XML": "text/xml",
"XPages": "text/xml",
"XProc": "text/xml",
"XQuery": "application/xquery",
"XS": "text/x-csrc",
"XSLT": "text/xml",
"YAML": "text/x-yaml",
"edn": "text/x-clojure",
"reStructuredText": "text/x-rst",
"wisp": "text/x-clojure",
}

109
data/rule/rule.go Normal file

@ -0,0 +1,109 @@
// Package rule contains rule-based heuristic implementations.
// It is used in the generated code in content.go for disambiguation of languages
// with colliding extensions, based on regexps from Linguist data.
package rule
// Heuristic consists of (a number of) rules where each, if it matches,
// identifies content as belonging to one or more programming languages.
type Heuristic interface {
Matcher
Languages() []string
}
// Matcher checks if the data matches (a number of) patterns.
// Every heuristic rule below implements this interface.
// A regexp.Regexp satisfies this interface and can be used instead.
type Matcher interface {
Match(data []byte) bool
}
// languages struct encapsulates data common to every Matcher: all languages
// that it identifies.
type languages struct {
langs []string
}
// Languages returns all languages, identified by this Matcher.
func (l languages) Languages() []string {
return l.langs
}
// MatchingLanguages is a helper to create new languages.
func MatchingLanguages(langs ...string) languages {
return languages{langs}
}
// Implements a Heuristic.
type or struct {
languages
pattern Matcher
}
// Or rule matches, if a single matching pattern exists.
// It receives only one pattern as it relies on compile-time optimization that
// represents union with | inside a single regexp.
func Or(l languages, r Matcher) Heuristic {
return or{l, r}
}
// Match implements rule.Matcher.
func (r or) Match(data []byte) bool {
return r.pattern.Match(data)
}
// Implements a Heuristic.
type and struct {
languages
patterns []Matcher
}
// And rule matches, if each of the patterns does match.
func And(l languages, m ...Matcher) Heuristic {
return and{l, m}
}
// Match implements rule.Matcher.
func (r and) Match(data []byte) bool {
for _, p := range r.patterns {
if !p.Match(data) {
return false
}
}
return true
}
// Implements a Heuristic.
type not struct {
languages
Patterns []Matcher
}
// Not rule matches if none of the patterns match.
func Not(l languages, r ...Matcher) Heuristic {
return not{l, r}
}
// Match implements rule.Matcher.
func (r not) Match(data []byte) bool {
for _, p := range r.Patterns {
if p.Match(data) {
return false
}
}
return true
}
// Implements a Heuristic.
type always struct {
languages
}
// Always rule always matches. It is often used as a default fallback.
func Always(l languages) Heuristic {
return always{l}
}
// Match implements Matcher.
func (r always) Match(data []byte) bool {
return true
}
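
How the rules above compose can be sketched with a reduced interface. This is a hedged illustration only: `matcher`, `allOf`, `noneOf`, and `unixAsm` are hypothetical names, and the language bookkeeping of the real package is left out. It relies on the fact that `*regexp.Regexp` already has a `Match([]byte) bool` method, so compiled patterns can be mixed freely with composite rules.

```go
package main

import (
	"fmt"
	"regexp"
)

// matcher is a hypothetical, reduced version of the Matcher interface above.
type matcher interface {
	Match(data []byte) bool
}

// allOf matches only if every nested matcher matches (an "and" rule).
type allOf []matcher

func (a allOf) Match(data []byte) bool {
	for _, m := range a {
		if !m.Match(data) {
			return false
		}
	}
	return true
}

// noneOf matches only if no nested matcher matches (a "not" rule).
type noneOf []matcher

func (n noneOf) Match(data []byte) bool {
	for _, m := range n {
		if m.Match(data) {
			return false
		}
	}
	return true
}

// unixAsm loosely follows the "Unix Assembly" rule of the ".ms" test fixture:
// no C-style comment opener, and an assembler directive or label at the start.
var unixAsm = allOf{
	noneOf{regexp.MustCompile(`/\*`)},
	regexp.MustCompile(`^\s*\.(?:include\s|globa?l\s|[A-Za-z][_A-Za-z0-9]*:)`),
}

func main() {
	for _, src := range []string{"\t.include \"math.s\"", "/* c */ .include x", "hello"} {
		fmt.Printf("%q matches: %v\n", src, unixAsm.Match([]byte(src)))
	}
}
```

Keeping the interface to a single method is what lets a plain compiled regexp sit next to nested rules without any adapter type.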

39
data/rule/rule_test.go Normal file

@ -0,0 +1,39 @@
package rule
import (
"regexp"
"testing"
"github.com/stretchr/testify/assert"
)
const lang = "ActionScript"
var fixtures = []struct {
name string
rule Heuristic
numLangs int
matching string
noMatch string
}{
{"Always", Always(MatchingLanguages(lang)), 1, "a", ""},
{"Not", Not(MatchingLanguages(lang), regexp.MustCompile(`a`)), 1, "b", "a"},
{"And", And(MatchingLanguages(lang), regexp.MustCompile(`a`), regexp.MustCompile(`b`)), 1, "ab", "a"},
{"Or", Or(MatchingLanguages(lang), regexp.MustCompile(`a|b`)), 1, "ab", "c"},
}
func TestRules(t *testing.T) {
for _, f := range fixtures {
t.Run(f.name, func(t *testing.T) {
assert.NotNil(t, f.rule)
assert.NotNil(t, f.rule.Languages())
assert.Equal(t, f.numLangs, len(f.rule.Languages()))
assert.Truef(t, f.rule.Match([]byte(f.matching)),
"'%s' is expected to .Match() by rule %s%v", f.matching, f.name, f.rule)
if f.noMatch != "" {
assert.Falsef(t, f.rule.Match([]byte(f.noMatch)),
"'%s' is expected NOT to .Match() by rule %s%v", f.noMatch, f.name, f.rule)
}
})
}
}


@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: 4cd558c37482e8d2c535d8107f2d11b49afbc5b5
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -21,282 +21,302 @@ var LanguagesType = map[string]int{
"Agda": 2,
"Alloy": 2,
"Alpine Abuild": 2,
"AngelScript": 2,
"Ant Build System": 1,
"ApacheConf": 1,
"Apex": 2,
"Apollo Guidance Computer": 2,
"AppleScript": 2,
"Arc": 2,
"Arduino": 2,
"AsciiDoc": 4,
"AspectJ": 2,
"Assembly": 2,
"Augeas": 2,
"AutoHotkey": 2,
"AutoIt": 2,
"Awk": 2,
"Ballerina": 2,
"Batchfile": 2,
"Befunge": 2,
"Bison": 2,
"BitBake": 2,
"Blade": 3,
"BlitzBasic": 2,
"BlitzMax": 2,
"Bluespec": 2,
"Boo": 2,
"Brainfuck": 2,
"Brightscript": 2,
"Bro": 2,
"C": 2,
"C#": 2,
"C++": 2,
"C-ObjDump": 1,
"C2hs Haskell": 2,
"CLIPS": 2,
"CMake": 2,
"COBOL": 2,
"COLLADA": 1,
"CSON": 1,
"CSS": 3,
"CSV": 1,
"CWeb": 2,
"Cap'n Proto": 2,
"CartoCSS": 2,
"Ceylon": 2,
"Chapel": 2,
"Charity": 2,
"ChucK": 2,
"Cirru": 2,
"Clarion": 2,
"Clean": 2,
"Click": 2,
"Clojure": 2,
"Closure Templates": 3,
"CoffeeScript": 2,
"ColdFusion": 2,
"ColdFusion CFC": 2,
"Common Lisp": 2,
"Component Pascal": 2,
"Cool": 2,
"Coq": 2,
"Cpp-ObjDump": 1,
"Creole": 4,
"Crystal": 2,
"Csound": 2,
"Csound Document": 2,
"Csound Score": 2,
"Cuda": 2,
"Cycript": 2,
"Cython": 2,
"D": 2,
"D-ObjDump": 1,
"DIGITAL Command Language": 2,
"DM": 2,
"DNS Zone": 1,
"DTrace": 2,
"Darcs Patch": 1,
"Dart": 2,
"DataWeave": 2,
"Diff": 1,
"Dockerfile": 1,
"Dogescript": 2,
"Dylan": 2,
"E": 2,
"EBNF": 1,
"ECL": 2,
"ECLiPSe": 2,
"EJS": 3,
"EQ": 2,
"Eagle": 1,
"Easybuild": 1,
"Ecere Projects": 1,
"Eiffel": 2,
"Elixir": 2,
"Elm": 2,
"Emacs Lisp": 2,
"EmberScript": 2,
"Erlang": 2,
"F#": 2,
"FLUX": 2,
"Factor": 2,
"Fancy": 2,
"Fantom": 2,
"Filebench WML": 2,
"Filterscript": 2,
"Formatted": 1,
"Forth": 2,
"Fortran": 2,
"FreeMarker": 2,
"Frege": 2,
"G-code": 1,
"GAMS": 2,
"GAP": 2,
"Apollo Guidance Computer": 2,
"AppleScript": 2,
"Arc": 2,
"AsciiDoc": 4,
"AspectJ": 2,
"Assembly": 2,
"Asymptote": 2,
"Augeas": 2,
"AutoHotkey": 2,
"AutoIt": 2,
"Awk": 2,
"Ballerina": 2,
"Batchfile": 2,
"Befunge": 2,
"Bison": 2,
"BitBake": 2,
"Blade": 3,
"BlitzBasic": 2,
"BlitzMax": 2,
"Bluespec": 2,
"Boo": 2,
"Brainfuck": 2,
"Brightscript": 2,
"Bro": 2,
"C": 2,
"C#": 2,
"C++": 2,
"C-ObjDump": 1,
"C2hs Haskell": 2,
"CLIPS": 2,
"CMake": 2,
"COBOL": 2,
"COLLADA": 1,
"CSON": 1,
"CSS": 3,
"CSV": 1,
"CWeb": 2,
"Cap'n Proto": 2,
"CartoCSS": 2,
"Ceylon": 2,
"Chapel": 2,
"Charity": 2,
"ChucK": 2,
"Cirru": 2,
"Clarion": 2,
"Clean": 2,
"Click": 2,
"Clojure": 2,
"Closure Templates": 3,
"Cloud Firestore Security Rules": 1,
"CoNLL-U": 1,
"CoffeeScript": 2,
"ColdFusion": 2,
"ColdFusion CFC": 2,
"Common Lisp": 2,
"Common Workflow Language": 2,
"Component Pascal": 2,
"Cool": 2,
"Coq": 2,
"Cpp-ObjDump": 1,
"Creole": 4,
"Crystal": 2,
"Csound": 2,
"Csound Document": 2,
"Csound Score": 2,
"Cuda": 2,
"Cycript": 2,
"Cython": 2,
"D": 2,
"D-ObjDump": 1,
"DIGITAL Command Language": 2,
"DM": 2,
"DNS Zone": 1,
"DTrace": 2,
"Darcs Patch": 1,
"Dart": 2,
"DataWeave": 2,
"Diff": 1,
"Dockerfile": 2,
"Dogescript": 2,
"Dylan": 2,
"E": 2,
"EBNF": 1,
"ECL": 2,
"ECLiPSe": 2,
"EJS": 3,
"EML": 1,
"EQ": 2,
"Eagle": 1,
"Easybuild": 1,
"Ecere Projects": 1,
"Edje Data Collection": 1,
"Eiffel": 2,
"Elixir": 2,
"Elm": 2,
"Emacs Lisp": 2,
"EmberScript": 2,
"Erlang": 2,
"F#": 2,
"F*": 2,
"FIGlet Font": 1,
"FLUX": 2,
"Factor": 2,
"Fancy": 2,
"Fantom": 2,
"Filebench WML": 2,
"Filterscript": 2,
"Formatted": 1,
"Forth": 2,
"Fortran": 2,
"FreeMarker": 2,
"Frege": 2,
"G-code": 1,
"GAMS": 2,
"GAP": 2,
"GCC Machine Description": 2,
"GDB": 2,
"GDScript": 2,
"GLSL": 2,
"GN": 1,
"Game Maker Language": 2,
"Genie": 2,
"Genshi": 2,
"Gentoo Ebuild": 2,
"Gentoo Eclass": 2,
"Gerber Image": 1,
"Gettext Catalog": 4,
"Gherkin": 2,
"Glyph": 2,
"Gnuplot": 2,
"Go": 2,
"Golo": 2,
"Gosu": 2,
"Grace": 2,
"Gradle": 1,
"Grammatical Framework": 2,
"Graph Modeling Language": 1,
"GraphQL": 1,
"Graphviz (DOT)": 1,
"Groovy": 2,
"Groovy Server Pages": 2,
"HCL": 2,
"HLSL": 2,
"HTML": 3,
"HTML+Django": 3,
"HTML+ECR": 3,
"HTML+EEX": 3,
"HTML+ERB": 3,
"HTML+PHP": 3,
"HTTP": 1,
"Hack": 2,
"Haml": 3,
"Handlebars": 3,
"Harbour": 2,
"Haskell": 2,
"Haxe": 2,
"Hy": 2,
"HyPhy": 2,
"IDL": 2,
"IGOR Pro": 2,
"INI": 1,
"IRC log": 1,
"Idris": 2,
"Inform 7": 2,
"Inno Setup": 2,
"Io": 2,
"Ioke": 2,
"Isabelle": 2,
"Isabelle ROOT": 2,
"J": 2,
"JFlex": 2,
"JSON": 1,
"JSON5": 1,
"JSONLD": 1,
"JSONiq": 2,
"JSX": 2,
"Jasmin": 2,
"Java": 2,
"Java Server Pages": 2,
"JavaScript": 2,
"Jison": 2,
"Jison Lex": 2,
"Jolie": 2,
"Julia": 2,
"Jupyter Notebook": 3,
"KRL": 2,
"KiCad Layout": 1,
"KiCad Legacy Layout": 1,
"KiCad Schematic": 1,
"Kit": 3,
"Kotlin": 2,
"LFE": 2,
"LLVM": 2,
"LOLCODE": 2,
"LSL": 2,
"LabVIEW": 2,
"Lasso": 2,
"Latte": 3,
"Lean": 2,
"Less": 3,
"Lex": 2,
"LilyPond": 2,
"Limbo": 2,
"Linker Script": 1,
"Linux Kernel Module": 1,
"Liquid": 3,
"Literate Agda": 2,
"Literate CoffeeScript": 2,
"Literate Haskell": 2,
"LiveScript": 2,
"Logos": 2,
"Logtalk": 2,
"LookML": 2,
"LoomScript": 2,
"Lua": 2,
"M": 2,
"M4": 2,
"M4Sugar": 2,
"MAXScript": 2,
"MQL4": 2,
"MQL5": 2,
"MTML": 3,
"MUF": 2,
"Makefile": 2,
"Mako": 2,
"Markdown": 4,
"Marko": 3,
"Mask": 3,
"Mathematica": 2,
"Matlab": 2,
"Maven POM": 1,
"Max": 2,
"MediaWiki": 4,
"Mercury": 2,
"Meson": 2,
"Metal": 2,
"MiniD": 2,
"Mirah": 2,
"Modelica": 2,
"Modula-2": 2,
"Module Management System": 2,
"Monkey": 2,
"Moocode": 2,
"MoonScript": 2,
"Myghty": 2,
"NCL": 2,
"NL": 1,
"NSIS": 2,
"Nearley": 2,
"Nemerle": 2,
"NetLinx": 2,
"NetLinx+ERB": 2,
"NetLogo": 2,
"NewLisp": 2,
"Nginx": 1,
"Nim": 2,
"Ninja": 1,
"Nit": 2,
"Nix": 2,
"Nu": 2,
"NumPy": 2,
"OCaml": 2,
"ObjDump": 1,
"Objective-C": 2,
"Objective-C++": 2,
"Objective-J": 2,
"Omgrofl": 2,
"Opa": 2,
"Opal": 2,
"OpenCL": 2,
"OpenEdge ABL": 2,
"OpenRC runscript": 2,
"OpenSCAD": 2,
"OpenType Feature File": 1,
"Game Maker Language": 2,
"Genie": 2,
"Genshi": 2,
"Gentoo Ebuild": 2,
"Gentoo Eclass": 2,
"Gerber Image": 1,
"Gettext Catalog": 4,
"Gherkin": 2,
"Git Attributes": 1,
"Git Config": 1,
"Glyph": 2,
"Glyph Bitmap Distribution Format": 1,
"Gnuplot": 2,
"Go": 2,
"Golo": 2,
"Gosu": 2,
"Grace": 2,
"Gradle": 1,
"Grammatical Framework": 2,
"Graph Modeling Language": 1,
"GraphQL": 1,
"Graphviz (DOT)": 1,
"Groovy": 2,
"Groovy Server Pages": 2,
"HAProxy": 1,
"HCL": 2,
"HLSL": 2,
"HTML": 3,
"HTML+Django": 3,
"HTML+ECR": 3,
"HTML+EEX": 3,
"HTML+ERB": 3,
"HTML+PHP": 3,
"HTML+Razor": 3,
"HTTP": 1,
"HXML": 1,
"Hack": 2,
"Haml": 3,
"Handlebars": 3,
"Harbour": 2,
"Haskell": 2,
"Haxe": 2,
"HiveQL": 2,
"Hy": 2,
"HyPhy": 2,
"IDL": 2,
"IGOR Pro": 2,
"INI": 1,
"IRC log": 1,
"Idris": 2,
"Ignore List": 1,
"Inform 7": 2,
"Inno Setup": 2,
"Io": 2,
"Ioke": 2,
"Isabelle": 2,
"Isabelle ROOT": 2,
"J": 2,
"JFlex": 2,
"JSON": 1,
"JSON with Comments": 1,
"JSON5": 1,
"JSONLD": 1,
"JSONiq": 2,
"JSX": 2,
"Jasmin": 2,
"Java": 2,
"Java Properties": 1,
"Java Server Pages": 2,
"JavaScript": 2,
"Jison": 2,
"Jison Lex": 2,
"Jolie": 2,
"Julia": 2,
"Jupyter Notebook": 3,
"KRL": 2,
"KiCad Layout": 1,
"KiCad Legacy Layout": 1,
"KiCad Schematic": 1,
"Kit": 3,
"Kotlin": 2,
"LFE": 2,
"LLVM": 2,
"LOLCODE": 2,
"LSL": 2,
"LTspice Symbol": 1,
"LabVIEW": 2,
"Lasso": 2,
"Latte": 3,
"Lean": 2,
"Less": 3,
"Lex": 2,
"LilyPond": 2,
"Limbo": 2,
"Linker Script": 1,
"Linux Kernel Module": 1,
"Liquid": 3,
"Literate Agda": 2,
"Literate CoffeeScript": 2,
"Literate Haskell": 2,
"LiveScript": 2,
"Logos": 2,
"Logtalk": 2,
"LookML": 2,
"LoomScript": 2,
"Lua": 2,
"M": 2,
"M4": 2,
"M4Sugar": 2,
"MATLAB": 2,
"MAXScript": 2,
"MQL4": 2,
"MQL5": 2,
"MTML": 3,
"MUF": 2,
"Makefile": 2,
"Mako": 2,
"Markdown": 4,
"Marko": 3,
"Mask": 3,
"Mathematica": 2,
"Maven POM": 1,
"Max": 2,
"MediaWiki": 4,
"Mercury": 2,
"Meson": 2,
"Metal": 2,
"MiniD": 2,
"Mirah": 2,
"Modelica": 2,
"Modula-2": 2,
"Modula-3": 2,
"Module Management System": 2,
"Monkey": 2,
"Moocode": 2,
"MoonScript": 2,
"Myghty": 2,
"NCL": 2,
"NL": 1,
"NSIS": 2,
"Nearley": 2,
"Nemerle": 2,
"NetLinx": 2,
"NetLinx+ERB": 2,
"NetLogo": 2,
"NewLisp": 2,
"Nextflow": 2,
"Nginx": 1,
"Nim": 2,
"Ninja": 1,
"Nit": 2,
"Nix": 2,
"Nu": 2,
"NumPy": 2,
"OCaml": 2,
"ObjDump": 1,
"Objective-C": 2,
"Objective-C++": 2,
"Objective-J": 2,
"Omgrofl": 2,
"Opa": 2,
"Opal": 2,
"OpenCL": 2,
"OpenEdge ABL": 2,
"OpenRC runscript": 2,
"OpenSCAD": 2,
"OpenType Feature File": 1,
"Org": 4,
"Ox": 2,
"Oxygene": 2,
"Oz": 2,
"P4": 2,
"PAWN": 2,
"PHP": 2,
"PLSQL": 2,
"PLpgSQL": 2,
@ -307,6 +327,7 @@ var LanguagesType = map[string]int{
"Parrot Assembly": 2,
"Parrot Internal Representation": 2,
"Pascal": 2,
"Pawn": 2,
"Pep8": 2,
"Perl": 2,
"Perl 6": 2,
@ -316,8 +337,10 @@ var LanguagesType = map[string]int{
"PigLatin": 2,
"Pike": 2,
"Pod": 4,
"Pod 6": 4,
"PogoScript": 2,
"Pony": 2,
"PostCSS": 3,
"PostScript": 3,
"PowerBuilder": 2,
"PowerShell": 2,
@ -336,6 +359,7 @@ var LanguagesType = map[string]int{
"Python traceback": 1,
"QML": 2,
"QMake": 2,
"Quake": 2,
"R": 2,
"RAML": 3,
"RDoc": 4,
@ -343,6 +367,7 @@ var LanguagesType = map[string]int{
"REXX": 2,
"RHTML": 3,
"RMarkdown": 4,
"RPC": 2,
"RPM Spec": 1,
"RUNOFF": 3,
"Racket": 2,
@ -359,6 +384,7 @@ var LanguagesType = map[string]int{
"Ring": 2,
"RobotFramework": 2,
"Roff": 3,
"Roff Manpage": 3,
"Rouge": 2,
"Ruby": 2,
"Rust": 2,
@ -385,10 +411,12 @@ var LanguagesType = map[string]int{
"ShellSession": 2,
"Shen": 2,
"Slash": 2,
"Slice": 2,
"Slim": 3,
"Smali": 2,
"Smalltalk": 2,
"Smarty": 2,
"Solidity": 2,
"SourcePawn": 2,
"Spline Font Database": 1,
"Squirrel": 2,
@ -397,7 +425,7 @@ var LanguagesType = map[string]int{
"Stata": 2,
"Stylus": 3,
"SubRip Text": 1,
"Sublime Text Config": 1,
"SugarSS": 3,
"SuperCollider": 2,
"Swift": 2,
"SystemVerilog": 2,
@ -437,32 +465,42 @@ var LanguagesType = map[string]int{
"Web Ontology Language": 1,
"WebAssembly": 2,
"WebIDL": 2,
"Windows Registry Entries": 1,
"World of Warcraft Addon Data": 1,
"X10": 2,
"XC": 2,
"XCompose": 1,
"XML": 1,
"XPM": 1,
"XPages": 1,
"XProc": 2,
"XQuery": 2,
"XS": 2,
"XSLT": 2,
"Xojo": 2,
"Xtend": 2,
"YAML": 1,
"YANG": 1,
"Yacc": 2,
"Zephir": 2,
"Zimpl": 2,
"desktop": 1,
"eC": 2,
"edn": 1,
"fish": 2,
"mupad": 2,
"nesC": 2,
"ooc": 2,
"reStructuredText": 4,
"wisp": 2,
"xBase": 2,
"X BitMap": 1,
"X Font Directory Index": 1,
"X PixMap": 1,
"X10": 2,
"XC": 2,
"XCompose": 1,
"XML": 1,
"XPages": 1,
"XProc": 2,
"XQuery": 2,
"XS": 2,
"XSLT": 2,
"Xojo": 2,
"Xtend": 2,
"YAML": 1,
"YANG": 1,
"YARA": 2,
"YASnippet": 3,
"Yacc": 2,
"Zephir": 2,
"Zig": 2,
"Zimpl": 2,
"desktop": 1,
"eC": 2,
"edn": 1,
"fish": 2,
"mupad": 2,
"nanorc": 1,
"nesC": 2,
"ooc": 2,
"q": 2,
"reStructuredText": 4,
"sed": 2,
"wdl": 2,
"wisp": 2,
"xBase": 2,
}


@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: 4cd558c37482e8d2c535d8107f2d11b49afbc5b5
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -10,7 +10,6 @@ var VendorMatchers = substring.Or(
substring.Regexp(`^[Dd]ependencies/`),
substring.Regexp(`(^|/)dist/`),
substring.Regexp(`^deps/`),
substring.Regexp(`^tools/`),
substring.Regexp(`(^|/)configure$`),
substring.Regexp(`(^|/)config.guess$`),
substring.Regexp(`(^|/)config.sub$`),
@ -32,13 +31,15 @@ var VendorMatchers = substring.Or(
substring.Regexp(`(^|/)bootstrap([^.]*)\.(js|css|less|scss|styl)$`),
substring.Regexp(`(^|/)custom\.bootstrap([^\s]*)(js|css|less|scss|styl)$`),
substring.Regexp(`(^|/)font-awesome\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)font-awesome/.*\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)foundation\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)normalize\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)skeleton\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)[Bb]ourbon/.*\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)animate\.(css|less|scss|styl)$`),
substring.Regexp(`third[-_]?party/`),
substring.Regexp(`3rd[-_]?party/`),
substring.Regexp(`(^|/)materialize\.(css|less|scss|styl|js)$`),
substring.Regexp(`(^|/)select2/.*\.(css|scss|js)$`),
substring.Regexp(`(3rd|[Tt]hird)[-_]?[Pp]arty/`),
substring.Regexp(`vendors?/`),
substring.Regexp(`extern(al)?/`),
substring.Regexp(`(^|/)[Vv]+endor/`),
@ -53,6 +54,9 @@ var VendorMatchers = substring.Or(
substring.Regexp(`jquery.fancybox.(js|css)`),
substring.Regexp(`fuelux.js`),
substring.Regexp(`(^|/)jquery\.fileupload(-\w+)?\.js$`),
substring.Regexp(`jquery.dataTables.js`),
substring.Regexp(`bootbox.js`),
substring.Regexp(`pdf.worker.js`),
substring.Regexp(`(^|/)slick\.\w+.js$`),
substring.Regexp(`(^|/)Leaflet\.Coordinates-\d+\.\d+\.\d+\.src\.js$`),
substring.Regexp(`leaflet.draw-src.js`),
@ -63,6 +67,7 @@ var VendorMatchers = substring.Or(
substring.Regexp(`wicket-leaflet.js`),
substring.Regexp(`.sublime-project`),
substring.Regexp(`.sublime-workspace`),
substring.Regexp(`.vscode`),
substring.Regexp(`(^|/)prototype(.*)\.js$`),
substring.Regexp(`(^|/)effects\.js$`),
substring.Regexp(`(^|/)controls\.js$`),
@ -99,8 +104,7 @@ var VendorMatchers = substring.Or(
substring.Regexp(`^.osx$`),
substring.Regexp(`\.xctemplate/`),
substring.Regexp(`\.imageset/`),
substring.Regexp(`^Carthage/`),
substring.Regexp(`^Pods/`),
substring.Regexp(`(^|/)Carthage/`),
substring.Regexp(`(^|/)Sparkle/`),
substring.Regexp(`Crashlytics.framework/`),
substring.Regexp(`Fabric.framework/`),
@ -113,6 +117,9 @@ var VendorMatchers = substring.Or(
substring.Regexp(`(^|/)gradlew$`),
substring.Regexp(`(^|/)gradlew\.bat$`),
substring.Regexp(`(^|/)gradle/wrapper/`),
substring.Regexp(`(^|/)mvnw$`),
substring.Regexp(`(^|/)mvnw\.cmd$`),
substring.Regexp(`(^|/)\.mvn/wrapper/`),
substring.Regexp(`-vsdoc\.js$`),
substring.Regexp(`\.intellisense\.js$`),
substring.Regexp(`(^|/)jquery([^.]*)\.validate(\.unobtrusive)?\.js$`),
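
To see what two of the changed vendor patterns accept, they can be tried against a few paths. This is a rough sketch under an assumption: it uses the standard `regexp` package as a stand-in for `substring.Regexp`, and `isVendored` is a hypothetical helper, not part of the generated file.

```go
package main

import (
	"fmt"
	"regexp"
)

// Two patterns copied from the updated matcher list.
var (
	thirdParty = regexp.MustCompile(`(3rd|[Tt]hird)[-_]?[Pp]arty/`)
	carthage   = regexp.MustCompile(`(^|/)Carthage/`)
)

// isVendored is a hypothetical helper that checks only these two patterns.
func isVendored(path string) bool {
	return thirdParty.MatchString(path) || carthage.MatchString(path)
}

func main() {
	fmt.Println(isVendored("third_party/lib.c"))                // true
	fmt.Println(isVendored("src/3rdParty/zlib/inflate.c"))      // true
	fmt.Println(isVendored("ios/Carthage/Build/Foo.framework")) // true
	fmt.Println(isVendored("src/party/main.go"))                // false
}
```

The third path would not have matched the previous `^Carthage/` pattern, which was anchored to the repository root.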


@ -1,9 +1,34 @@
package data
// LanguagesByAlias keeps alias for different languages and use the name of the languages as an alias too.
import "strings"
// LanguageByAliasMap keeps aliases for different languages and uses the language names as aliases too.
// All keys (alias or not) are written in lower case, with whitespace replaced by underscores.
var LanguagesByAlias = map[string]string{
var LanguageByAliasMap = map[string]string{
{{range $alias, $language := . -}}
"{{ $alias }}": {{ printf "%q" $language -}},
"{{ $alias }}": {{ printf "%q" $language -}},
{{end -}}
}
// LanguageByAlias looks up the language name by its alias or name.
// It mirrors the logic of github linguist and is needed e.g. for heuristics.yml
// that mixes names and aliases in a language field (see XPM example).
func LanguageByAlias(langOrAlias string) (lang string, ok bool) {
k := convertToAliasKey(langOrAlias)
lang, ok = LanguageByAliasMap[k]
return
}
// convertToAliasKey converts language name to a key in LanguageByAliasMap.
// Following
// - internal.code-generator.generator.convertToAliasKey()
// - GetLanguageByAlias()
// conventions.
// It is here to avoid dependency on "generate" and "enry" packages.
func convertToAliasKey(langName string) string {
ak := strings.SplitN(langName, `,`, 2)[0]
ak = strings.Replace(ak, ` `, `_`, -1)
ak = strings.ToLower(ak)
return ak
}
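
The key normalisation above (cut at the first comma, replace spaces with underscores, lower-case) can be checked with a small standalone program. This is a sketch only: `aliasKey` repeats the three steps of `convertToAliasKey`, while `sampleAliases` and `lookup` are hypothetical and stand in for the generated map and `LanguageByAlias`.

```go
package main

import (
	"fmt"
	"strings"
)

// aliasKey repeats the three normalisation steps of convertToAliasKey.
func aliasKey(name string) string {
	key := strings.SplitN(name, ",", 2)[0]
	key = strings.Replace(key, " ", "_", -1)
	return strings.ToLower(key)
}

// sampleAliases is a hand-written, hypothetical subset of the generated map.
var sampleAliases = map[string]string{
	"common_lisp": "Common Lisp",
	"golang":      "Go",
}

// lookup resolves a language name or alias through the normalised key.
func lookup(langOrAlias string) (string, bool) {
	lang, ok := sampleAliases[aliasKey(langOrAlias)]
	return lang, ok
}

func main() {
	fmt.Println(aliasKey("Common Lisp")) // common_lisp
	fmt.Println(lookup("Golang"))        // Go true
}
```

Because names and aliases share one key space, a caller does not need to know which of the two it holds.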


@ -1,33 +1,51 @@
package data
import "gopkg.in/toqueteos/substring.v1"
import (
"regexp"
type languageMatcher func ([]byte) []string
"gopkg.in/src-d/enry.v1/data/rule"
)
var ContentMatchers = map[string]languageMatcher{
{{ range $index, $disambiguator := . -}}
{{ printf "%q" $disambiguator.Extension }}: func(i []byte) []string {
{{ range $i, $language := $disambiguator.Languages -}}
{{- if not (avoidLanguage $language) }}
{{- if gt (len $language.Heuristics) 0 }}
{{- if gt $i 0 }} else {{ end -}}
if {{- range $j, $heuristic := $language.Heuristics }} {{ $heuristic.Name }}.Match(string(i))
{{- if lt $j (len $language.LogicRelations) }} {{index $language.LogicRelations $j}} {{- end -}} {{ end }} {
return []string{ {{- printf "%q" $language.Language -}} }
}
{{- end -}}
{{- end -}}
{{- end}}
return {{ returnLanguages $disambiguator.Languages | returnStringSlice }}
},
var ContentHeuristics = map[string]*Heuristics{
{{ range $ext, $rules := . -}}
{{ printf "%q" $ext }}: &Heuristics{
{{ range $rule := $rules -}}
{{template "Rule" $rule}}
{{ end -}}
},
{{ end -}}
}
var (
{{ range $index, $heuristic := getAllHeuristics . -}}
{{ $heuristic.Name }} = substring.Regexp(`{{ $heuristic.Regexp }}`)
{{ end -}}
)
{{ define "Rule" -}}
{{ if eq .Op "And" -}}
rule.And(
{{ template "Languages" .Langs -}}
{{ range $rule := .Rules -}}
{{template "Rule" $rule}}
{{ end -}}
),
{{- else if eq .Op "Or" -}}
rule.Or(
{{ template "Languages" .Langs -}}
regexp.MustCompile(`{{ .Pattern }}`),
),
{{- else if eq .Op "Not" -}}
rule.Not(
{{ template "Languages" .Langs -}}
regexp.MustCompile(`{{ .Pattern }}`),
),
{{- else if eq .Op "Always" -}}
rule.Always(
{{ template "Languages" .Langs -}}
),
{{ end -}}
{{ end -}}
{{define "Languages" -}}
{{with . -}}
rule.MatchingLanguages( {{range .}} {{printf "\"%s\"" .}}, {{end}} ),
{{ else -}}
rule.MatchingLanguages(""),
{{end -}}
{{end}}
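
The new template relies on a named template (`"Rule"`) that invokes itself for nested `And` rules. The sketch below shows that recursion with `text/template` in a reduced form; the `node` type, the template text and the output format are simplified assumptions and do not reproduce the generator's real template.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// node is a hypothetical, trimmed-down rule tree: an operator,
// an optional pattern and optional child rules.
type node struct {
	Op      string
	Pattern string
	Rules   []*node
}

// The "Rule" template calls itself for every child of an "And" node.
const ruleTmpl = `{{define "Rule"}}{{if eq .Op "And"}}And({{range .Rules}}{{template "Rule" .}}{{end}}){{else}}{{.Op}}({{printf "%q" .Pattern}}){{end}},{{end}}{{template "Rule" .}}`

// render executes the recursive template for one rule tree.
func render(n *node) (string, error) {
	t, err := template.New("root").Parse(ruleTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, n); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	tree := &node{Op: "And", Rules: []*node{
		{Op: "Or", Pattern: "a"},
		{Op: "Not", Pattern: "b"},
	}}
	out, err := render(tree)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // And(Or("a"),Not("b"),),
}
```

The trailing commas are harmless in generated Go composite literals and argument lists, which is presumably why the real template emits them as well.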


@ -2,10 +2,11 @@ package generator
import (
"bytes"
"gopkg.in/yaml.v2"
"io"
"io/ioutil"
"strings"
"gopkg.in/yaml.v2"
)
// Aliases reads from fileToParse and builds source file from tmplPath. It complies with type File signature.
@ -21,10 +22,10 @@ func Aliases(fileToParse, samplesDir, outPath, tmplPath, tmplName, commit string
}
orderedLangList := getAlphabeticalOrderedKeys(languages)
languagesByAlias := buildAliasLanguageMap(languages, orderedLangList)
languageByAlias := buildAliasLanguageMap(languages, orderedLangList)
buf := &bytes.Buffer{}
if err := executeAliasesTemplate(buf, languagesByAlias, tmplPath, tmplName, commit); err != nil {
if err := executeAliasesTemplate(buf, languageByAlias, tmplPath, tmplName, commit); err != nil {
return err
}
@ -52,6 +53,6 @@ func convertToAliasKey(s string) (key string) {
return
}
func executeAliasesTemplate(out io.Writer, languagesByAlias map[string]string, aliasesTmplPath, aliasesTmpl, commit string) error {
return executeTemplate(out, aliasesTmpl, aliasesTmplPath, commit, nil, languagesByAlias)
func executeAliasesTemplate(out io.Writer, languageByAlias map[string]string, aliasesTmplPath, aliasesTmpl, commit string) error {
return executeTemplate(out, aliasesTmpl, aliasesTmplPath, commit, nil, languageByAlias)
}


@ -2,13 +2,14 @@ package generator
import (
"bytes"
"gopkg.in/yaml.v2"
"io"
"io/ioutil"
"gopkg.in/yaml.v2"
)
// Documentation reads from fileToParse and builds source file from tmplPath. It complies with type File signature.
func Documentation(fileToParse, samplesDir, outPath, tmplPath, tmplName, commit string) error {
// Documentation generates regex matchers in Go for documentation files/dirs.
// It is of generator.File type.
func Documentation(fileToParse, _, outFile, tmplPath, tmplName, commit string) error {
data, err := ioutil.ReadFile(fileToParse)
if err != nil {
return err
@ -20,13 +21,10 @@ func Documentation(fileToParse, samplesDir, outPath, tmplPath, tmplName, commit
}
buf := &bytes.Buffer{}
if err := executeDocumentationTemplate(buf, regexpList, tmplPath, tmplName, commit); err != nil {
err = executeTemplate(buf, tmplName, tmplPath, commit, nil, regexpList)
if err != nil {
return err
}
return formatedWrite(outPath, buf.Bytes())
}
func executeDocumentationTemplate(out io.Writer, regexpList []string, tmplPath, tmplName, commit string) error {
return executeTemplate(out, tmplName, tmplPath, commit, nil, regexpList)
return formatedWrite(outFile, buf.Bytes())
}


@ -1,3 +1,5 @@
// Package generator provides facilities to generate Go code for the
// package data in enry from YAML files describing supported languages in Linguist.
package generator
import (
@ -9,7 +11,10 @@ import (
"text/template"
)
// File is the function's type that generate source file from a file to be parsed, linguist's samples dir and a template.
// File is a common type for all generator functions.
// It generates a Go source file from the template in tmplPath,
// by parsing the data in fileToParse and linguist's samplesDir,
// and saves the result to outPath.
type File func(fileToParse, samplesDir, outPath, tmplPath, tmplName, commit string) error
func formatedWrite(outPath string, source []byte) error {
@ -28,16 +33,14 @@ func executeTemplate(w io.Writer, name, path, commit string, fmap template.FuncM
return commit
}
buf := bytes.NewBuffer(nil)
const headerTmpl = "header.go.tmpl"
headerPath := filepath.Join(filepath.Dir(path), headerTmpl)
h := template.Must(template.New(headerTmpl).Funcs(template.FuncMap{
"getCommit": getCommit,
}).ParseFiles(headerPath))
buf := bytes.NewBuffer(nil)
if err := h.Execute(buf, data); err != nil {
return err
}


@ -1,6 +1,7 @@
package generator
import (
"flag"
"fmt"
"io/ioutil"
"os"
@ -15,7 +16,7 @@ import (
const (
linguistURL = "https://github.com/github/linguist.git"
linguistClonedEnvVar = "ENRY_TEST_REPO"
commit = "d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68"
commit = "e4560984058b4726010ca4b8f03ed9d0f8f464db"
samplesDir = "samples"
languagesFile = "lib/linguist/languages.yml"
@ -28,7 +29,7 @@ const (
extensionTestTmplName = "extension.go.tmpl"
// Heuristics test
heuristicsTestFile = "lib/linguist/heuristics.rb"
heuristicsTestFile = "lib/linguist/heuristics.yml"
contentGold = testDir + "/content.gold"
contentTestTmplPath = assetsDir + "/content.go.tmpl"
contentTestTmplName = "content.go.tmpl"
@ -85,13 +86,27 @@ type GeneratorTestSuite struct {
suite.Suite
tmpLinguist string
cloned bool
testCases []testCase
}
func TestGeneratorTestSuite(t *testing.T) {
type testCase struct {
name string
fileToParse string
samplesDir string
tmplPath string
tmplName string
commit string
generate File
wantOut string
}
var updateGold = flag.Bool("update_gold", false, "Update golden test files")
func Test_GeneratorTestSuite(t *testing.T) {
suite.Run(t, new(GeneratorTestSuite))
}
func (s *GeneratorTestSuite) SetupSuite() {
func (s *GeneratorTestSuite) maybeCloneLinguist() {
var err error
s.tmpLinguist = os.Getenv(linguistClonedEnvVar)
s.cloned = s.tmpLinguist == ""
@ -101,40 +116,25 @@ func (s *GeneratorTestSuite) SetupSuite() {
cmd := exec.Command("git", "clone", linguistURL, s.tmpLinguist)
err = cmd.Run()
assert.NoError(s.T(), err)
}
cwd, err := os.Getwd()
assert.NoError(s.T(), err)
cwd, err := os.Getwd()
assert.NoError(s.T(), err)
err = os.Chdir(s.tmpLinguist)
assert.NoError(s.T(), err)
err = os.Chdir(s.tmpLinguist)
assert.NoError(s.T(), err)
cmd := exec.Command("git", "checkout", commit)
err = cmd.Run()
assert.NoError(s.T(), err)
cmd = exec.Command("git", "checkout", commit)
err = cmd.Run()
assert.NoError(s.T(), err)
err = os.Chdir(cwd)
assert.NoError(s.T(), err)
}
func (s *GeneratorTestSuite) TearDownSuite() {
if s.cloned {
err := os.RemoveAll(s.tmpLinguist)
err = os.Chdir(cwd)
assert.NoError(s.T(), err)
}
}
func (s *GeneratorTestSuite) TestGenerationFiles() {
tests := []struct {
name string
fileToParse string
samplesDir string
tmplPath string
tmplName string
commit string
generate File
wantOut string
}{
func (s *GeneratorTestSuite) SetupSuite() {
s.maybeCloneLinguist()
s.testCases = []testCase{
{
name: "Extensions()",
fileToParse: filepath.Join(s.tmpLinguist, languagesFile),
@ -152,7 +152,7 @@ func (s *GeneratorTestSuite) TestGenerationFiles() {
tmplPath: contentTestTmplPath,
tmplName: contentTestTmplName,
commit: commit,
generate: Heuristics,
generate: GenHeuristics,
wantOut: contentGold,
},
{
@ -244,8 +244,35 @@ func (s *GeneratorTestSuite) TestGenerationFiles() {
wantOut: mimeTypeGold,
},
}
}
for _, test := range tests {
func (s *GeneratorTestSuite) TearDownSuite() {
if s.cloned {
err := os.RemoveAll(s.tmpLinguist)
if err != nil {
s.T().Logf("Failed to clean up %s after the test.\n", s.tmpLinguist)
}
}
}
// TestUpdateGeneratorTestSuiteGold automates regeneration of the golden result files.
// It should only be enabled and run manually for every new Linguist version
// to update *.gold files.
func (s *GeneratorTestSuite) TestUpdateGeneratorTestSuiteGold() {
if !*updateGold {
s.T().Skip()
}
s.T().Logf("Generating new *.gold test files")
for _, test := range s.testCases {
dst := test.wantOut
s.T().Logf("Generating %s from %s\n", dst, test.fileToParse)
err := test.generate(test.fileToParse, test.samplesDir, dst, test.tmplPath, test.tmplName, test.commit)
assert.NoError(s.T(), err)
}
}
func (s *GeneratorTestSuite) TestGenerationFiles() {
for _, test := range s.testCases {
gold, err := ioutil.ReadFile(test.wantOut)
assert.NoError(s.T(), err)


@ -1,483 +1,176 @@
package generator
import (
"bufio"
"bytes"
"fmt"
"io"
"io/ioutil"
"strconv"
"log"
"strings"
"text/template"
"gopkg.in/src-d/enry.v1/regex"
yaml "gopkg.in/yaml.v2"
)
// Heuristics reads from fileToParse and builds source file from tmplPath. It complies with type File signature.
func Heuristics(fileToParse, samplesDir, outPath, tmplPath, tmplName, commit string) error {
data, err := ioutil.ReadFile(fileToParse)
const (
multilinePrefix = "(?m)"
orPipe = "|"
)
// GenHeuristics generates language identification heuristics in Go.
// It is of generator.File type.
func GenHeuristics(fileToParse, _, outPath, tmplPath, tmplName, commit string) error {
heuristicsYaml, err := parseYaml(fileToParse)
if err != nil {
return err
}
disambiguators, err := getDisambiguators(data)
langPatterns, err := loadHeuristics(heuristicsYaml)
if err != nil {
return err
}
buf := &bytes.Buffer{}
if err := executeContentTemplate(buf, disambiguators, tmplPath, tmplName, commit); err != nil {
err = executeTemplate(buf, tmplName, tmplPath, commit, nil, langPatterns)
if err != nil {
return err
}
return formatedWrite(outPath, buf.Bytes())
}
const (
unknownLanguage = "OtherLanguage"
emptyFile = "^$"
)
var (
disambLine = regex.MustCompile(`^(\s*)disambiguate`)
definedRegs = make(map[string]string)
illegalCharacter = map[string]string{
"#": "Sharp",
"+": "Plus",
"-": "Dash",
}
)
type disambiguator struct {
Extension string `json:"extension,omitempty"`
Languages []*languageHeuristics `json:"languages,omitempty"`
}
func (d *disambiguator) setHeuristicsNames() {
for _, lang := range d.Languages {
for i, heuristic := range lang.Heuristics {
name := buildName(d.Extension, lang.Language, i)
heuristic.Name = name
}
}
}
func buildName(extension, language string, id int) string {
extension = strings.TrimPrefix(extension, `.`)
language = strings.Join(strings.Fields(language), ``)
name := strings.Join([]string{extension, language, "Matcher", strconv.Itoa(id)}, `_`)
for k, v := range illegalCharacter {
if strings.Contains(name, k) {
name = strings.Replace(name, k, v, -1)
}
}
return name
}
type languageHeuristics struct {
Language string `json:"language,omitempty"`
Heuristics []*heuristic `json:"heuristics,omitempty"`
LogicRelations []string `json:"logic_relations,omitempty"`
}
func (l *languageHeuristics) clone() (*languageHeuristics, error) {
language := l.Language
logicRels := make([]string, len(l.LogicRelations))
if copy(logicRels, l.LogicRelations) != len(l.LogicRelations) {
return nil, fmt.Errorf("error copying logic relations")
}
heuristics := make([]*heuristic, 0, len(l.Heuristics))
for _, h := range l.Heuristics {
heuristic := *h
heuristics = append(heuristics, &heuristic)
}
clone := &languageHeuristics{
Language: language,
Heuristics: heuristics,
LogicRelations: logicRels,
}
return clone, nil
}
type heuristic struct {
Name string `json:"name,omitempty"`
Regexp string `json:"regexp,omitempty"`
}
// A disambiguate block looks like:
// disambiguate ".mod", ".extension" do |data|
// if data.include?('<!ENTITY ') && data.include?('patata')
// Language["XML"]
// elsif /^\s*MODULE [\w\.]+;/i.match(data) || /^\s*END [\w\.]+;/i.match(data) || data.empty?
// Language["Modula-2"]
// elsif (/^\s*import (scala|java)\./.match(data) || /^\s*val\s+\w+\s*=/.match(data) || /^\s*class\b/.match(data))
// Language["Scala"]
// elsif (data.include?("gap> "))
// Language["GAP"]
// else
// [Language["Linux Kernel Module"], Language["AMPL"]]
// end
// end
func getDisambiguators(heuristics []byte) ([]*disambiguator, error) {
seenExtensions := map[string]bool{}
buf := bufio.NewScanner(bytes.NewReader(heuristics))
disambiguators := make([]*disambiguator, 0, 50)
for buf.Scan() {
line := buf.Text()
if disambLine.MatchString(line) {
d, err := parseDisambiguators(line, buf, seenExtensions)
if err != nil {
return nil, err
}
disambiguators = append(disambiguators, d...)
}
lookForRegexpVariables(line)
}
if err := buf.Err(); err != nil {
return nil, err
}
return disambiguators, nil
}
func lookForRegexpVariables(line string) {
if strings.Contains(line, "ObjectiveCRegex = ") {
line = strings.TrimSpace(line)
reg := strings.TrimPrefix(line, "ObjectiveCRegex = ")
definedRegs["ObjectiveCRegex"] = reg
}
if strings.Contains(line, "fortran_rx = ") {
line = strings.TrimSpace(line)
reg := strings.TrimPrefix(line, "fortran_rx = ")
definedRegs["fortran_rx"] = reg
}
}
func parseDisambiguators(line string, buf *bufio.Scanner, seenExtensions map[string]bool) ([]*disambiguator, error) {
disambList := make([]*disambiguator, 0, 2)
splitted := strings.Fields(line)
for _, v := range splitted {
if strings.HasPrefix(v, `"`) {
extension := strings.Trim(v, `",`)
if _, ok := seenExtensions[extension]; !ok {
d := &disambiguator{Extension: extension}
disambList = append(disambList, d)
seenExtensions[extension] = true
// loadHeuristics transforms parsed YAML to map[".ext"]->IR for code generation.
func loadHeuristics(yaml *Heuristics) (map[string][]*LanguagePattern, error) {
var patterns = make(map[string][]*LanguagePattern)
for _, disambiguation := range yaml.Disambiguations {
var rules []*LanguagePattern
for _, rule := range disambiguation.Rules {
langPattern := loadRule(yaml.NamedPatterns, rule)
if langPattern != nil {
rules = append(rules, langPattern)
}
}
// unroll to a single map
for _, ext := range disambiguation.Extensions {
if _, ok := patterns[ext]; ok {
return nil, fmt.Errorf("cannot add extension '%s', it already exists for %q", ext, patterns[ext])
}
patterns[ext] = rules
}
}
return patterns, nil
}
// loadRule transforms single rule from parsed YAML to IR for code generation.
// For OrPattern case, it always combines multiple patterns into a single one.
func loadRule(namedPatterns map[string]StringArray, rule *Rule) *LanguagePattern {
var result *LanguagePattern
if len(rule.And) != 0 { // AndPattern
var subPatterns []*LanguagePattern
for _, r := range rule.And {
subp := loadRule(namedPatterns, r)
subPatterns = append(subPatterns, subp)
}
result = &LanguagePattern{"And", rule.Languages, "", subPatterns}
} else if len(rule.Pattern) != 0 { // OrPattern
conjunction := strings.Join(rule.Pattern, orPipe)
pattern := convertToValidRegexp(conjunction)
result = &LanguagePattern{"Or", rule.Languages, pattern, nil}
} else if rule.NegativePattern != "" { // NotPattern
pattern := convertToValidRegexp(rule.NegativePattern)
result = &LanguagePattern{"Not", rule.Languages, pattern, nil}
} else if rule.NamedPattern != "" { // Named OrPattern
conjunction := strings.Join(namedPatterns[rule.NamedPattern], orPipe)
pattern := convertToValidRegexp(conjunction)
result = &LanguagePattern{"Or", rule.Languages, pattern, nil}
} else { // AlwaysPattern
result = &LanguagePattern{"Always", rule.Languages, "", nil}
}
langsHeuristics, err := getLanguagesHeuristics(buf)
if isUnsupportedRegexpSyntax(result.Pattern) {
log.Printf("skipping rule: language:'%q', rule:'%q'\n", rule.Languages, result.Pattern)
return nil
}
return result
}
// LanguagePattern is an IR of parsed Rule suitable for code generation.
// Strings are used as this is to be consumed by text/template.
type LanguagePattern struct {
Op string
Langs []string
Pattern string
Rules []*LanguagePattern
}
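
`loadRule` folds a list of patterns into a single alternation by joining them with `|`. The sketch below illustrates that step with the standard `regexp` package. `combine` is a hypothetical helper, the two sample patterns are made up, and prepending `(?m)` is an assumption based on the `multilinePrefix` constant, since `convertToValidRegexp` is not shown in this section.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// combine joins the patterns into one alternation and compiles it
// in multi-line mode, so that ^ and $ apply to every line.
func combine(patterns []string) (*regexp.Regexp, error) {
	return regexp.Compile("(?m)" + strings.Join(patterns, "|"))
}

func main() {
	re, err := combine([]string{`^\s*import\b`, `^\s*val\s+\w+\s*=`})
	if err != nil {
		panic(err)
	}
	fmt.Println(re.MatchString("// header\nval x = 1\n")) // true
	fmt.Println(re.MatchString("print(1)"))               // false
}
```

Alternation has the lowest precedence in regular expression syntax, so anchors such as `^` stay attached to their own alternative after the join.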
type Heuristics struct {
Disambiguations []*Disambiguation
NamedPatterns map[string]StringArray `yaml:"named_patterns"`
}
type Disambiguation struct {
Extensions []string `yaml:"extensions,flow"`
Rules []*Rule `yaml:"rules"`
}
type Rule struct {
Patterns `yaml:",inline"`
Languages StringArray `yaml:"language"`
And []*Rule
}
type Patterns struct {
Pattern StringArray `yaml:"pattern,omitempty"`
NamedPattern string `yaml:"named_pattern,omitempty"`
NegativePattern string `yaml:"negative_pattern,omitempty"`
}
// StringArray is a workaround for parsing named_pattern,
// which is sometimes an array and sometimes not.
// See https://github.com/go-yaml/yaml/issues/100
type StringArray []string
// UnmarshalYAML allows an element to always be parsed as a []string
func (sa *StringArray) UnmarshalYAML(unmarshal func(interface{}) error) error {
var multi []string
if err := unmarshal(&multi); err != nil {
var single string
if err := unmarshal(&single); err != nil {
return err
}
*sa = []string{single}
} else {
*sa = multi
}
return nil
}
func parseYaml(file string) (*Heuristics, error) {
data, err := ioutil.ReadFile(file)
if err != nil {
return nil, err
}
for i, disamb := range disambList {
lh := langsHeuristics
if i != 0 {
lh = cloneLanguagesHeuristics(langsHeuristics)
}
disamb.Languages = lh
disamb.setHeuristicsNames()
}
return disambList, nil
}
func cloneLanguagesHeuristics(list []*languageHeuristics) []*languageHeuristics {
cloneList := make([]*languageHeuristics, 0, len(list))
for _, langHeu := range list {
clone, _ := langHeu.clone()
cloneList = append(cloneList, clone)
}
return cloneList
}
func getLanguagesHeuristics(buf *bufio.Scanner) ([]*languageHeuristics, error) {
langsList := make([][]string, 0, 2)
heuristicsList := make([][]*heuristic, 0, 1)
logicRelsList := make([][]string, 0, 1)
lastWasMatch := false
for buf.Scan() {
line := buf.Text()
if strings.TrimSpace(line) == "end" {
break
}
if hasRegExp(line) {
line := cleanRegExpLine(line)
logicRels := getLogicRelations(line)
heuristics := getHeuristics(line)
if lastWasMatch {
i := len(heuristicsList) - 1
heuristicsList[i] = append(heuristicsList[i], heuristics...)
i = len(logicRelsList) - 1
logicRelsList[i] = append(logicRelsList[i], logicRels...)
} else {
heuristicsList = append(heuristicsList, heuristics)
logicRelsList = append(logicRelsList, logicRels)
}
lastWasMatch = true
}
if strings.Contains(line, "Language") {
langs := getLanguages(line)
langsList = append(langsList, langs)
lastWasMatch = false
}
}
if err := buf.Err(); err != nil {
h := &Heuristics{}
if err := yaml.Unmarshal(data, &h); err != nil {
return nil, err
}
langsHeuristics := buildLanguagesHeuristics(langsList, heuristicsList, logicRelsList)
return langsHeuristics, nil
return h, nil
}
func hasRegExp(line string) bool {
return strings.Contains(line, ".match") || strings.Contains(line, ".include?") || strings.Contains(line, ".empty?")
// isUnsupportedRegexpSyntax filters regexp syntax that is not supported by RE2.
// In particular, we stumbled upon the following cases:
// - named & numbered capturing group/after text matching
// - backreference
// For reference on supported syntax see https://github.com/google/re2/wiki/Syntax
func isUnsupportedRegexpSyntax(reg string) bool {
return strings.Contains(reg, `(?<`) || strings.Contains(reg, `\1`) ||
// See https://github.com/github/linguist/pull/4243#discussion_r246105067
(strings.HasPrefix(reg, multilinePrefix+`/`) && strings.HasSuffix(reg, `/`))
}
func cleanRegExpLine(line string) string {
if strings.Contains(line, "if ") {
line = line[strings.Index(line, `if `)+3:]
}
line = strings.TrimSpace(line)
line = strings.TrimPrefix(line, `(`)
if strings.Contains(line, "))") {
line = strings.TrimSuffix(line, `)`)
}
return line
}
func getLogicRelations(line string) []string {
rels := make([]string, 0)
splitted := strings.Split(line, "||")
for i, v := range splitted {
if strings.Contains(v, "&&") {
rels = append(rels, "&&")
}
if i < len(splitted)-1 {
rels = append(rels, "||")
}
}
if len(rels) == 0 {
rels = nil
}
return rels
}
func getHeuristics(line string) []*heuristic {
splitted := splitByLogicOps(line)
heuristics := make([]*heuristic, 0, len(splitted))
for _, v := range splitted {
v = strings.TrimSpace(v)
var reg string
if strings.Contains(v, ".match") {
reg = v[:strings.Index(v, ".match")]
reg = replaceRegexpVariables(reg)
}
if strings.Contains(v, ".include?") {
reg = includeToRegExp(v)
}
if strings.Contains(v, ".empty?") {
reg = emptyFile
}
if reg != "" {
reg = convertToValidRegexp(reg)
heuristics = append(heuristics, &heuristic{Regexp: reg})
}
}
return heuristics
}
func splitByLogicOps(line string) []string {
splitted := make([]string, 0, 1)
splitOr := strings.Split(line, "||")
for _, v := range splitOr {
splitAnd := strings.Split(v, "&&")
splitted = append(splitted, splitAnd...)
}
return splitted
}
func replaceRegexpVariables(reg string) string {
repl := reg
if v, ok := definedRegs[reg]; ok {
repl = v
}
return repl
}
func convertToValidRegexp(reg string) string {
// example: `/^(\s*)(<Project|<Import|<Property|<?xml|xmlns)/i`
// Ruby modifier "m" matches multiple lines, recognizing newlines as normal characters; Go uses flag "s" for that.
const (
caseSensitive = "i"
matchEOL = "s"
rubyCaseSensitive = "i"
rubyMultiLine = "m"
)
if reg == emptyFile {
return reg
}
reg = strings.TrimPrefix(reg, `/`)
flags := "(?m"
lastSlash := strings.LastIndex(reg, `/`)
if lastSlash == -1 {
return flags + ")" + reg
}
specialChars := reg[lastSlash:]
reg = reg[:lastSlash]
if lastSlash == len(reg)-1 {
return flags + ")" + reg
}
if strings.Contains(specialChars, rubyCaseSensitive) {
flags = flags + caseSensitive
}
if strings.Contains(specialChars, rubyMultiLine) {
flags = flags + matchEOL
}
return flags + ")" + reg
}
func includeToRegExp(include string) string {
content := include[strings.Index(include, `(`)+1 : strings.Index(include, `)`)]
content = strings.Trim(content, `"'`)
return regex.QuoteMeta(content)
}
func getLanguages(line string) []string {
languages := make([]string, 0)
splitted := strings.Split(line, `,`)
for _, lang := range splitted {
lang = trimLanguage(lang)
languages = append(languages, lang)
}
return languages
}
func trimLanguage(enclosedLang string) string {
lang := strings.TrimSpace(enclosedLang)
lang = lang[strings.Index(lang, `"`)+1:]
lang = lang[:strings.Index(lang, `"`)]
return lang
}
func buildLanguagesHeuristics(langsList [][]string, heuristicsList [][]*heuristic, logicRelsList [][]string) []*languageHeuristics {
langsHeuristics := make([]*languageHeuristics, 0, len(langsList))
for i, langSlice := range langsList {
var heuristics []*heuristic
if i < len(heuristicsList) {
heuristics = heuristicsList[i]
}
var rels []string
if i < len(logicRelsList) {
rels = logicRelsList[i]
}
for _, lang := range langSlice {
lh := &languageHeuristics{
Language: lang,
Heuristics: heuristics,
LogicRelations: rels,
}
langsHeuristics = append(langsHeuristics, lh)
}
}
return langsHeuristics
}
func executeContentTemplate(out io.Writer, disambiguators []*disambiguator, tmplPath, tmplName, commit string) error {
fmap := template.FuncMap{
"getAllHeuristics": getAllHeuristics,
"returnStringSlice": func(slice []string) string {
if len(slice) == 0 {
return "nil"
}
return `[]string{` + strings.Join(slice, `, `) + `}`
},
"returnLanguages": returnLanguages,
"avoidLanguage": avoidLanguage,
}
return executeTemplate(out, tmplName, tmplPath, commit, fmap, disambiguators)
}
func getAllHeuristics(disambiguators []*disambiguator) []*heuristic {
heuristics := make([]*heuristic, 0)
for _, disamb := range disambiguators {
for _, lang := range disamb.Languages {
if !avoidLanguage(lang) {
heuristics = append(heuristics, lang.Heuristics...)
}
}
}
return heuristics
}
func avoidLanguage(lang *languageHeuristics) bool {
// necessary to avoid corner cases
for _, heuristic := range lang.Heuristics {
if containsInvalidRegexp(heuristic.Regexp) {
return true
}
}
return false
}
func containsInvalidRegexp(reg string) bool {
return strings.Contains(reg, `(?<`) || strings.Contains(reg, `\1`)
}
func returnLanguages(langsHeuristics []*languageHeuristics) []string {
langs := make([]string, 0)
for _, langHeu := range langsHeuristics {
if len(langHeu.Heuristics) == 0 {
langs = append(langs, `"`+langHeu.Language+`"`)
}
}
return langs
// convertToValidRegexp converts Ruby regexp syntax to its RE2 equivalent.
// Does not work with Ruby regexp literals.
func convertToValidRegexp(rubyRegexp string) string {
return multilinePrefix + rubyRegexp
}
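
The new `isUnsupportedRegexpSyntax` and `convertToValidRegexp` pair can be illustrated with a small standalone sketch. It assumes `multilinePrefix` is `"(?m)"`, since the constant's definition is not part of this diff. The helper names `toRE2`, `looksUnsupported` and `compiles` are hypothetical, and the regexp-literal check from the original function is left out. The sample patterns are made up for illustration and do not come from `heuristics.yml`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// multilinePrefix is an assumed value; its definition is not shown in the diff.
const multilinePrefix = "(?m)"

// toRE2 mirrors the new convertToValidRegexp: it only prepends the multiline flag.
func toRE2(rubyRegexp string) string {
	return multilinePrefix + rubyRegexp
}

// looksUnsupported mirrors the two substring checks of isUnsupportedRegexpSyntax
// (the check for Ruby regexp literals is omitted in this sketch).
func looksUnsupported(reg string) bool {
	return strings.Contains(reg, `(?<`) || strings.Contains(reg, `\1`)
}

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(reg string) bool {
	_, err := regexp.Compile(reg)
	return err == nil
}

func main() {
	patterns := []string{
		`^\s*#import`, // plain pattern
		`(\w+)\s+\1`,  // backreference, not available in RE2
		`(?<!foo)bar`, // negative lookbehind, not available in RE2
	}
	for _, p := range patterns {
		re := toRE2(p)
		fmt.Printf("%q skip=%v compiles=%v\n", re, looksUnsupported(re), compiles(re))
	}
}
```

Filtering on substrings before code generation keeps the generator from emitting a `regexp.MustCompile` call that would panic when the generated package is loaded.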

View File

@ -0,0 +1,123 @@
package generator
import (
"bytes"
"fmt"
"go/format"
"testing"
"text/template"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestYAMLParsing(t *testing.T) {
heuristics, err := parseYaml("test_files/heuristics.yml")
require.NoError(t, err)
assert.NotNil(t, heuristics)
// extensions
require.NotNil(t, heuristics.Disambiguations)
assert.Equal(t, 4, len(heuristics.Disambiguations))
assert.Equal(t, 2, len(heuristics.Disambiguations[0].Extensions))
rules := heuristics.Disambiguations[0].Rules
assert.Equal(t, 2, len(rules))
require.Equal(t, "Objective-C", rules[0].Languages[0])
assert.Equal(t, 1, len(rules[0].Pattern))
rules = heuristics.Disambiguations[1].Rules
assert.Equal(t, 3, len(rules))
require.Equal(t, "Forth", rules[0].Languages[0])
require.Equal(t, 2, len(rules[0].Pattern))
rules = heuristics.Disambiguations[2].Rules
assert.Equal(t, 3, len(rules))
require.Equal(t, "Unix Assembly", rules[1].Languages[0])
require.NotNil(t, rules[1].And)
assert.Equal(t, 2, len(rules[1].And))
require.NotNil(t, rules[1].And[0].NegativePattern)
assert.Equal(t, "np", rules[1].And[0].NegativePattern)
rules = heuristics.Disambiguations[3].Rules
assert.Equal(t, 1, len(rules))
assert.Equal(t, "Linux Kernel Module", rules[0].Languages[0])
assert.Equal(t, "AMPL", rules[0].Languages[1])
// named_patterns
require.NotNil(t, heuristics.NamedPatterns)
assert.Equal(t, 2, len(heuristics.NamedPatterns))
assert.Equal(t, 1, len(heuristics.NamedPatterns["fortran"]))
assert.Equal(t, 2, len(heuristics.NamedPatterns["cpp"]))
}
func TestSingleRuleLoading(t *testing.T) {
namedPatterns := map[string]StringArray{"cpp": []string{"cpp_ptrn1", "cpp_ptrn2"}}
rules := []*Rule{
&Rule{Languages: []string{"a"}, Patterns: Patterns{NamedPattern: "cpp"}},
&Rule{Languages: []string{"b"}, And: []*Rule{}},
}
// named_pattern case
langPattern := loadRule(namedPatterns, rules[0])
require.Equal(t, "a", langPattern.Langs[0])
assert.NotEmpty(t, langPattern.Pattern)
// and case
langPattern = loadRule(namedPatterns, rules[1])
require.Equal(t, "b", langPattern.Langs[0])
}
func TestLoadingAllHeuristics(t *testing.T) {
parsedYaml, err := parseYaml("test_files/heuristics.yml")
require.NoError(t, err)
hs, err := loadHeuristics(parsedYaml)
// grep -Eo "extensions:\ (.*)" internal/code-generator/generator/test_files/heuristics.yml
assert.Equal(t, 5, len(hs))
}
func TestLoadingHeuristicsForSameExt(t *testing.T) {
parsedYaml := &Heuristics{
Disambiguations: []*Disambiguation{
&Disambiguation{
Extensions: []string{".a", ".b"},
Rules: []*Rule{&Rule{Languages: []string{"A"}}},
},
&Disambiguation{
Extensions: []string{".b"},
Rules: []*Rule{&Rule{Languages: []string{"B"}}},
},
},
}
_, err := loadHeuristics(parsedYaml)
require.Error(t, err)
}
func TestTemplateMatcherVars(t *testing.T) {
parsed, err := parseYaml("test_files/heuristics.yml")
require.NoError(t, err)
heuristics, err := loadHeuristics(parsed)
require.NoError(t, err)
// render a tmpl
const contentTmpl = "../assets/content.go.tmpl"
tmpl, err := template.ParseFiles(contentTmpl)
require.NoError(t, err)
buf := bytes.NewBuffer(nil)
err = tmpl.Execute(buf, heuristics)
require.NoError(t, err, fmt.Sprintf("%+v", tmpl))
require.NotEmpty(t, buf)
// TODO(bzz) add more advanced test using go/ast package, to verify the
// structure of generated code:
// - check key literal exists in map for each extension:
src, err := format.Source(buf.Bytes())
require.NoError(t, err, "\n%s\n", string(src))
}

View File

@ -2,11 +2,14 @@ package generator
import (
"bytes"
"gopkg.in/yaml.v2"
"io"
"io/ioutil"
"gopkg.in/yaml.v2"
)
// MimeType generates a map in Go with language name -> MIME string.
// It is of generator.File type.
func MimeType(fileToParse, samplesDir, outPath, tmplPath, tmplName, commit string) error {
data, err := ioutil.ReadFile(fileToParse)
if err != nil {

View File

@ -1,23 +1,29 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
// LanguagesByAlias keeps alias for different languages and use the name of the languages as an alias too.
import "strings"
// LanguageByAliasMap keeps aliases for different languages and uses the names of the languages as aliases too.
// All the keys (alias or not) are written in lower case, with whitespace replaced by underscores.
var LanguagesByAlias = map[string]string{
"1c_enterprise": "1C Enterprise",
"abap": "ABAP",
"abl": "OpenEdge ABL",
"abnf": "ABNF",
"abuild": "Alpine Abuild",
"aconf": "ApacheConf",
"actionscript": "ActionScript",
"actionscript3": "ActionScript",
"actionscript_3": "ActionScript",
"ada": "Ada",
"ada2005": "Ada",
"ada95": "Ada",
var LanguageByAliasMap = map[string]string{
"1c_enterprise": "1C Enterprise",
"abap": "ABAP",
"abl": "OpenEdge ABL",
"abnf": "ABNF",
"abuild": "Alpine Abuild",
"acfm": "Adobe Font Metrics",
"aconf": "ApacheConf",
"actionscript": "ActionScript",
"actionscript3": "ActionScript",
"actionscript_3": "ActionScript",
"ada": "Ada",
"ada2005": "Ada",
"ada95": "Ada",
"adobe_composite_font_metrics": "Adobe Font Metrics",
"adobe_font_metrics": "Adobe Font Metrics",
"adobe_multiple_font_metrics": "Adobe Font Metrics",
"advpl": "xBase",
"afdko": "OpenType Feature File",
"agda": "Agda",
@ -26,7 +32,9 @@ var LanguagesByAlias = map[string]string{
"ahk": "AutoHotkey",
"alloy": "Alloy",
"alpine_abuild": "Alpine Abuild",
"amfm": "Adobe Font Metrics",
"ampl": "AMPL",
"angelscript": "AngelScript",
"ant_build_system": "Ant Build System",
"antlr": "ANTLR",
"apache": "ApacheConf",
@ -35,78 +43,81 @@ var LanguagesByAlias = map[string]string{
"api_blueprint": "API Blueprint",
"apkbuild": "Alpine Abuild",
"apl": "APL",
"apollo_guidance_computer": "Apollo Guidance Computer",
"applescript": "AppleScript",
"arc": "Arc",
"arduino": "Arduino",
"arexx": "REXX",
"as3": "ActionScript",
"asciidoc": "AsciiDoc",
"asn.1": "ASN.1",
"asp": "ASP",
"aspectj": "AspectJ",
"aspx": "ASP",
"aspx-vb": "ASP",
"assembly": "Assembly",
"ats": "ATS",
"ats2": "ATS",
"au3": "AutoIt",
"augeas": "Augeas",
"autoconf": "M4Sugar",
"autohotkey": "AutoHotkey",
"autoit": "AutoIt",
"autoit3": "AutoIt",
"autoitscript": "AutoIt",
"awk": "Awk",
"b3d": "BlitzBasic",
"bash": "Shell",
"bash_session": "ShellSession",
"bat": "Batchfile",
"batch": "Batchfile",
"batchfile": "Batchfile",
"befunge": "Befunge",
"bison": "Bison",
"bitbake": "BitBake",
"blade": "Blade",
"blitz3d": "BlitzBasic",
"blitzbasic": "BlitzBasic",
"blitzmax": "BlitzMax",
"blitzplus": "BlitzBasic",
"bluespec": "Bluespec",
"bmax": "BlitzMax",
"boo": "Boo",
"bplus": "BlitzBasic",
"brainfuck": "Brainfuck",
"brightscript": "Brightscript",
"bro": "Bro",
"bsdmake": "Makefile",
"byond": "DM",
"c": "C",
"c#": "C#",
"c++": "C++",
"c++-objdump": "Cpp-ObjDump",
"c-objdump": "C-ObjDump",
"c2hs": "C2hs Haskell",
"c2hs_haskell": "C2hs Haskell",
"cap'n_proto": "Cap'n Proto",
"carto": "CartoCSS",
"cartocss": "CartoCSS",
"ceylon": "Ceylon",
"cfc": "ColdFusion CFC",
"cfm": "ColdFusion",
"cfml": "ColdFusion",
"chapel": "Chapel",
"charity": "Charity",
"chpl": "Chapel",
"chuck": "ChucK",
"cirru": "Cirru",
"clarion": "Clarion",
"clean": "Clean",
"click": "Click",
"clipper": "xBase",
"clips": "CLIPS",
"clojure": "Clojure",
"closure_templates": "Closure Templates",
"apollo_guidance_computer": "Apollo Guidance Computer",
"applescript": "AppleScript",
"arc": "Arc",
"arexx": "REXX",
"as3": "ActionScript",
"asciidoc": "AsciiDoc",
"asm": "Assembly",
"asn.1": "ASN.1",
"asp": "ASP",
"aspectj": "AspectJ",
"aspx": "ASP",
"aspx-vb": "ASP",
"assembly": "Assembly",
"asymptote": "Asymptote",
"ats": "ATS",
"ats2": "ATS",
"au3": "AutoIt",
"augeas": "Augeas",
"autoconf": "M4Sugar",
"autohotkey": "AutoHotkey",
"autoit": "AutoIt",
"autoit3": "AutoIt",
"autoitscript": "AutoIt",
"awk": "Awk",
"b3d": "BlitzBasic",
"ballerina": "Ballerina",
"bash": "Shell",
"bash_session": "ShellSession",
"bat": "Batchfile",
"batch": "Batchfile",
"batchfile": "Batchfile",
"befunge": "Befunge",
"bison": "Bison",
"bitbake": "BitBake",
"blade": "Blade",
"blitz3d": "BlitzBasic",
"blitzbasic": "BlitzBasic",
"blitzmax": "BlitzMax",
"blitzplus": "BlitzBasic",
"bluespec": "Bluespec",
"bmax": "BlitzMax",
"boo": "Boo",
"bplus": "BlitzBasic",
"brainfuck": "Brainfuck",
"brightscript": "Brightscript",
"bro": "Bro",
"bsdmake": "Makefile",
"byond": "DM",
"c": "C",
"c#": "C#",
"c++": "C++",
"c++-objdump": "Cpp-ObjDump",
"c-objdump": "C-ObjDump",
"c2hs": "C2hs Haskell",
"c2hs_haskell": "C2hs Haskell",
"cap'n_proto": "Cap'n Proto",
"carto": "CartoCSS",
"cartocss": "CartoCSS",
"ceylon": "Ceylon",
"cfc": "ColdFusion CFC",
"cfm": "ColdFusion",
"cfml": "ColdFusion",
"chapel": "Chapel",
"charity": "Charity",
"chpl": "Chapel",
"chuck": "ChucK",
"cirru": "Cirru",
"clarion": "Clarion",
"clean": "Clean",
"click": "Click",
"clipper": "xBase",
"clips": "CLIPS",
"clojure": "Clojure",
"closure_templates": "Closure Templates",
"cloud_firestore_security_rules": "Cloud Firestore Security Rules",
"cmake": "CMake",
"cobol": "COBOL",
"coffee": "CoffeeScript",
@ -117,10 +128,15 @@ var LanguagesByAlias = map[string]string{
"coldfusion_html": "ColdFusion",
"collada": "COLLADA",
"common_lisp": "Common Lisp",
"common_workflow_language": "Common Workflow Language",
"component_pascal": "Component Pascal",
"conll": "CoNLL-U",
"conll-u": "CoNLL-U",
"conll-x": "CoNLL-U",
"console": "ShellSession",
"cool": "Cool",
"coq": "Coq",
"cperl": "Perl",
"cpp": "C++",
"cpp-objdump": "Cpp-ObjDump",
"creole": "Creole",
@ -138,12 +154,14 @@ var LanguagesByAlias = map[string]string{
"cucumber": "Gherkin",
"cuda": "Cuda",
"cweb": "CWeb",
"cwl": "Common Workflow Language",
"cycript": "Cycript",
"cython": "Cython",
"d": "D",
"d-objdump": "D-ObjDump",
"darcs_patch": "Darcs Patch",
"dart": "Dart",
"dataweave": "DataWeave",
"dcl": "DIGITAL Command Language",
"delphi": "Component Pascal",
"desktop": "desktop",
@ -169,53 +187,69 @@ var LanguagesByAlias = map[string]string{
"ecl": "ECL",
"eclipse": "ECLiPSe",
"ecr": "HTML+ECR",
"edn": "edn",
"eex": "HTML+EEX",
"eiffel": "Eiffel",
"ejs": "EJS",
"elisp": "Emacs Lisp",
"elixir": "Elixir",
"elm": "Elm",
"emacs": "Emacs Lisp",
"emacs_lisp": "Emacs Lisp",
"emberscript": "EmberScript",
"eq": "EQ",
"erb": "HTML+ERB",
"erlang": "Erlang",
"f#": "F#",
"factor": "Factor",
"fancy": "Fancy",
"fantom": "Fantom",
"filebench_wml": "Filebench WML",
"filterscript": "Filterscript",
"fish": "fish",
"flex": "Lex",
"flux": "FLUX",
"formatted": "Formatted",
"forth": "Forth",
"fortran": "Fortran",
"foxpro": "xBase",
"freemarker": "FreeMarker",
"frege": "Frege",
"fsharp": "F#",
"ftl": "FreeMarker",
"fundamental": "Text",
"g-code": "G-code",
"game_maker_language": "Game Maker Language",
"gams": "GAMS",
"gap": "GAP",
"edje_data_collection": "Edje Data Collection",
"edn": "edn",
"eeschema_schematic": "KiCad Schematic",
"eex": "HTML+EEX",
"eiffel": "Eiffel",
"ejs": "EJS",
"elisp": "Emacs Lisp",
"elixir": "Elixir",
"elm": "Elm",
"emacs": "Emacs Lisp",
"emacs_lisp": "Emacs Lisp",
"emberscript": "EmberScript",
"eml": "EML",
"eq": "EQ",
"erb": "HTML+ERB",
"erlang": "Erlang",
"f#": "F#",
"f*": "F*",
"factor": "Factor",
"fancy": "Fancy",
"fantom": "Fantom",
"figfont": "FIGlet Font",
"figlet_font": "FIGlet Font",
"filebench_wml": "Filebench WML",
"filterscript": "Filterscript",
"fish": "fish",
"flex": "Lex",
"flux": "FLUX",
"formatted": "Formatted",
"forth": "Forth",
"fortran": "Fortran",
"foxpro": "xBase",
"freemarker": "FreeMarker",
"frege": "Frege",
"fsharp": "F#",
"fstar": "F*",
"ftl": "FreeMarker",
"fundamental": "Text",
"g-code": "G-code",
"game_maker_language": "Game Maker Language",
"gams": "GAMS",
"gap": "GAP",
"gcc_machine_description": "GCC Machine Description",
"gdb": "GDB",
"gdscript": "GDScript",
"genie": "Genie",
"genshi": "Genshi",
"gentoo_ebuild": "Gentoo Ebuild",
"gentoo_eclass": "Gentoo Eclass",
"gettext_catalog": "Gettext Catalog",
"gf": "Grammatical Framework",
"gherkin": "Gherkin",
"glsl": "GLSL",
"glyph": "Glyph",
"gdb": "GDB",
"gdscript": "GDScript",
"genie": "Genie",
"genshi": "Genshi",
"gentoo_ebuild": "Gentoo Ebuild",
"gentoo_eclass": "Gentoo Eclass",
"gerber_image": "Gerber Image",
"gettext_catalog": "Gettext Catalog",
"gf": "Grammatical Framework",
"gherkin": "Gherkin",
"git-ignore": "Ignore List",
"git_attributes": "Git Attributes",
"git_config": "Git Config",
"gitattributes": "Git Attributes",
"gitconfig": "Git Config",
"gitignore": "Ignore List",
"gitmodules": "Git Config",
"glsl": "GLSL",
"glyph": "Glyph",
"glyph_bitmap_distribution_format": "Glyph Bitmap Distribution Format",
"gn": "GN",
"gnuplot": "Gnuplot",
"go": "Go",
@ -228,17 +262,20 @@ var LanguagesByAlias = map[string]string{
"graph_modeling_language": "Graph Modeling Language",
"graphql": "GraphQL",
"graphviz_(dot)": "Graphviz (DOT)",
"groff": "Roff",
"groovy": "Groovy",
"groovy_server_pages": "Groovy Server Pages",
"gsp": "Groovy Server Pages",
"hack": "Hack",
"haml": "Haml",
"handlebars": "Handlebars",
"haproxy": "HAProxy",
"harbour": "Harbour",
"haskell": "Haskell",
"haxe": "Haxe",
"hbs": "Handlebars",
"hcl": "HCL",
"hiveql": "HiveQL",
"hlsl": "HLSL",
"html": "HTML",
"html+django": "HTML+Django",
@ -248,16 +285,20 @@ var LanguagesByAlias = map[string]string{
"html+erb": "HTML+ERB",
"html+jinja": "HTML+Django",
"html+php": "HTML+PHP",
"html+razor": "HTML+Razor",
"html+ruby": "RHTML",
"htmlbars": "Handlebars",
"htmldjango": "HTML+Django",
"http": "HTTP",
"hxml": "HXML",
"hy": "Hy",
"hylang": "Hy",
"hyphy": "HyPhy",
"i7": "Inform 7",
"idl": "IDL",
"idris": "Idris",
"ignore": "Ignore List",
"ignore_list": "Ignore List",
"igor": "IGOR Pro",
"igor_pro": "IGOR Pro",
"igorpro": "IGOR Pro",
@ -277,6 +318,7 @@ var LanguagesByAlias = map[string]string{
"j": "J",
"jasmin": "Jasmin",
"java": "Java",
"java_properties": "Java Properties",
"java_server_page": "Groovy Server Pages",
"java_server_pages": "Java Server Pages",
"javascript": "JavaScript",
@ -288,13 +330,17 @@ var LanguagesByAlias = map[string]string{
"js": "JavaScript",
"json": "JSON",
"json5": "JSON5",
"json_with_comments": "JSON with Comments",
"jsonc": "JSON with Comments",
"jsoniq": "JSONiq",
"jsonld": "JSONLD",
"jsp": "Java Server Pages",
"jsx": "JSX",
"julia": "Julia",
"jupyter_notebook": "Jupyter Notebook",
"kicad": "KiCad",
"kicad_layout": "KiCad Layout",
"kicad_legacy_layout": "KiCad Legacy Layout",
"kicad_schematic": "KiCad Schematic",
"kit": "Kit",
"kotlin": "Kotlin",
"krl": "KRL",
@ -329,6 +375,7 @@ var LanguagesByAlias = map[string]string{
"loomscript": "LoomScript",
"ls": "LiveScript",
"lsl": "LSL",
"ltspice_symbol": "LTspice Symbol",
"lua": "Lua",
"m": "M",
"m4": "M4",
@ -337,17 +384,22 @@ var LanguagesByAlias = map[string]string{
"make": "Makefile",
"makefile": "Makefile",
"mako": "Mako",
"man": "Roff",
"man-page": "Roff",
"man_page": "Roff",
"manpage": "Roff",
"markdown": "Markdown",
"marko": "Marko",
"markojs": "Marko",
"mask": "Mask",
"mathematica": "Mathematica",
"matlab": "Matlab",
"matlab": "MATLAB",
"maven_pom": "Maven POM",
"max": "Max",
"max/msp": "Max",
"maxmsp": "Max",
"maxscript": "MAXScript",
"mdoc": "Roff",
"mediawiki": "MediaWiki",
"mercury": "Mercury",
"meson": "Meson",
@ -358,6 +410,7 @@ var LanguagesByAlias = map[string]string{
"mma": "Mathematica",
"modelica": "Modelica",
"modula-2": "Modula-2",
"modula-3": "Modula-3",
"module_management_system": "Module Management System",
"monkey": "Monkey",
"moocode": "Moocode",
@ -369,14 +422,17 @@ var LanguagesByAlias = map[string]string{
"mumps": "M",
"mupad": "mupad",
"myghty": "Myghty",
"nanorc": "nanorc",
"nasm": "Assembly",
"ncl": "NCL",
"nearley": "Nearley",
"nemerle": "Nemerle",
"nesc": "nesC",
"netlinx": "NetLinx",
"netlinx+erb": "NetLinx+ERB",
"netlogo": "NetLogo",
"newlisp": "NewLisp",
"nextflow": "Nextflow",
"nginx": "Nginx",
"nginx_configuration_file": "Nginx",
"nim": "Nim",
@ -409,8 +465,9 @@ var LanguagesByAlias = map[string]string{
"objectpascal": "Component Pascal",
"objj": "Objective-J",
"ocaml": "OCaml",
"octave": "Matlab",
"octave": "MATLAB",
"omgrofl": "Omgrofl",
"oncrpc": "RPC",
"ooc": "ooc",
"opa": "Opa",
"opal": "Opal",
@ -433,216 +490,270 @@ var LanguagesByAlias = map[string]string{
"parrot": "Parrot",
"parrot_assembly": "Parrot Assembly",
"parrot_internal_representation": "Parrot Internal Representation",
"pascal": "Pascal",
"pasm": "Parrot Assembly",
"pawn": "PAWN",
"pep8": "Pep8",
"perl": "Perl",
"perl_6": "Perl 6",
"php": "PHP",
"pic": "Pic",
"pickle": "Pickle",
"picolisp": "PicoLisp",
"piglatin": "PigLatin",
"pike": "Pike",
"pir": "Parrot Internal Representation",
"plpgsql": "PLpgSQL",
"plsql": "PLSQL",
"pod": "Pod",
"pogoscript": "PogoScript",
"pony": "Pony",
"posh": "PowerShell",
"postscr": "PostScript",
"postscript": "PostScript",
"pot": "Gettext Catalog",
"pov-ray": "POV-Ray SDL",
"pov-ray_sdl": "POV-Ray SDL",
"povray": "POV-Ray SDL",
"powerbuilder": "PowerBuilder",
"powershell": "PowerShell",
"processing": "Processing",
"progress": "OpenEdge ABL",
"prolog": "Prolog",
"propeller_spin": "Propeller Spin",
"protobuf": "Protocol Buffer",
"protocol_buffer": "Protocol Buffer",
"protocol_buffers": "Protocol Buffer",
"public_key": "Public Key",
"pug": "Pug",
"puppet": "Puppet",
"pure_data": "Pure Data",
"purebasic": "PureBasic",
"purescript": "PureScript",
"pycon": "Python console",
"pyrex": "Cython",
"python": "Python",
"python_console": "Python console",
"python_traceback": "Python traceback",
"qmake": "QMake",
"qml": "QML",
"r": "R",
"racket": "Racket",
"ragel": "Ragel",
"ragel-rb": "Ragel",
"ragel-ruby": "Ragel",
"rake": "Ruby",
"raml": "RAML",
"rascal": "Rascal",
"raw": "Raw token data",
"raw_token_data": "Raw token data",
"rb": "Ruby",
"rbx": "Ruby",
"rdoc": "RDoc",
"realbasic": "REALbasic",
"reason": "Reason",
"rebol": "Rebol",
"red": "Red",
"red/system": "Red",
"redcode": "Redcode",
"regex": "Regular Expression",
"regexp": "Regular Expression",
"regular_expression": "Regular Expression",
"ren'py": "Ren'Py",
"renderscript": "RenderScript",
"renpy": "Ren'Py",
"restructuredtext": "reStructuredText",
"rexx": "REXX",
"rhtml": "RHTML",
"ring": "Ring",
"rmarkdown": "RMarkdown",
"robotframework": "RobotFramework",
"roff": "Roff",
"rouge": "Rouge",
"rpm_spec": "RPM Spec",
"rscript": "R",
"rss": "XML",
"rst": "reStructuredText",
"ruby": "Ruby",
"runoff": "RUNOFF",
"rust": "Rust",
"rusthon": "Python",
"sage": "Sage",
"salt": "SaltStack",
"saltstack": "SaltStack",
"saltstate": "SaltStack",
"sas": "SAS",
"sass": "Sass",
"scala": "Scala",
"scaml": "Scaml",
"scheme": "Scheme",
"scilab": "Scilab",
"scss": "SCSS",
"self": "Self",
"sh": "Shell",
"shaderlab": "ShaderLab",
"shell": "Shell",
"shell-script": "Shell",
"shellsession": "ShellSession",
"shen": "Shen",
"slash": "Slash",
"slim": "Slim",
"smali": "Smali",
"smalltalk": "Smalltalk",
"smarty": "Smarty",
"sml": "Standard ML",
"smt": "SMT",
"sourcemod": "SourcePawn",
"sourcepawn": "SourcePawn",
"sparql": "SPARQL",
"specfile": "RPM Spec",
"spline_font_database": "Spline Font Database",
"splus": "R",
"sqf": "SQF",
"sql": "SQL",
"sqlpl": "SQLPL",
"squeak": "Smalltalk",
"squirrel": "Squirrel",
"srecode_template": "SRecode Template",
"stan": "Stan",
"standard_ml": "Standard ML",
"stata": "Stata",
"ston": "STON",
"stylus": "Stylus",
"sublime_text_config": "Sublime Text Config",
"subrip_text": "SubRip Text",
"supercollider": "SuperCollider",
"svg": "SVG",
"swift": "Swift",
"systemverilog": "SystemVerilog",
"tcl": "Tcl",
"tcsh": "Tcsh",
"tea": "Tea",
"terra": "Terra",
"tex": "TeX",
"text": "Text",
"textile": "Textile",
"thrift": "Thrift",
"ti_program": "TI Program",
"tl": "Type Language",
"tla": "TLA",
"toml": "TOML",
"ts": "TypeScript",
"turing": "Turing",
"turtle": "Turtle",
"twig": "Twig",
"txl": "TXL",
"type_language": "Type Language",
"typescript": "TypeScript",
"udiff": "Diff",
"unified_parallel_c": "Unified Parallel C",
"unity3d_asset": "Unity3D Asset",
"unix_assembly": "Unix Assembly",
"uno": "Uno",
"unrealscript": "UnrealScript",
"ur": "UrWeb",
"ur/web": "UrWeb",
"urweb": "UrWeb",
"vala": "Vala",
"vb.net": "Visual Basic",
"vbnet": "Visual Basic",
"vcl": "VCL",
"verilog": "Verilog",
"vhdl": "VHDL",
"vim": "Vim script",
"vim_script": "Vim script",
"viml": "Vim script",
"visual_basic": "Visual Basic",
"volt": "Volt",
"vue": "Vue",
"wasm": "WebAssembly",
"wast": "WebAssembly",
"wavefront_material": "Wavefront Material",
"wavefront_object": "Wavefront Object",
"web_ontology_language": "Web Ontology Language",
"webassembly": "WebAssembly",
"webidl": "WebIDL",
"winbatch": "Batchfile",
"wisp": "wisp",
"pascal": "Pascal",
"pasm": "Parrot Assembly",
"pawn": "Pawn",
"pcbnew": "KiCad Layout",
"pep8": "Pep8",
"perl": "Perl",
"perl6": "Perl 6",
"perl_6": "Perl 6",
"php": "PHP",
"pic": "Pic",
"pickle": "Pickle",
"picolisp": "PicoLisp",
"piglatin": "PigLatin",
"pike": "Pike",
"pir": "Parrot Internal Representation",
"plpgsql": "PLpgSQL",
"plsql": "PLSQL",
"pod": "Pod",
"pod_6": "Pod 6",
"pogoscript": "PogoScript",
"pony": "Pony",
"posh": "PowerShell",
"postcss": "PostCSS",
"postscr": "PostScript",
"postscript": "PostScript",
"pot": "Gettext Catalog",
"pov-ray": "POV-Ray SDL",
"pov-ray_sdl": "POV-Ray SDL",
"povray": "POV-Ray SDL",
"powerbuilder": "PowerBuilder",
"powershell": "PowerShell",
"processing": "Processing",
"progress": "OpenEdge ABL",
"prolog": "Prolog",
"propeller_spin": "Propeller Spin",
"protobuf": "Protocol Buffer",
"protocol_buffer": "Protocol Buffer",
"protocol_buffers": "Protocol Buffer",
"public_key": "Public Key",
"pug": "Pug",
"puppet": "Puppet",
"pure_data": "Pure Data",
"purebasic": "PureBasic",
"purescript": "PureScript",
"pwsh": "PowerShell",
"pycon": "Python console",
"pyrex": "Cython",
"python": "Python",
"python3": "Python",
"python_console": "Python console",
"python_traceback": "Python traceback",
"q": "q",
"qmake": "QMake",
"qml": "QML",
"quake": "Quake",
"r": "R",
"racket": "Racket",
"ragel": "Ragel",
"ragel-rb": "Ragel",
"ragel-ruby": "Ragel",
"rake": "Ruby",
"raml": "RAML",
"rascal": "Rascal",
"raw": "Raw token data",
"raw_token_data": "Raw token data",
"razor": "HTML+Razor",
"rb": "Ruby",
"rbx": "Ruby",
"rdoc": "RDoc",
"realbasic": "REALbasic",
"reason": "Reason",
"rebol": "Rebol",
"red": "Red",
"red/system": "Red",
"redcode": "Redcode",
"regex": "Regular Expression",
"regexp": "Regular Expression",
"regular_expression": "Regular Expression",
"ren'py": "Ren'Py",
"renderscript": "RenderScript",
"renpy": "Ren'Py",
"restructuredtext": "reStructuredText",
"rexx": "REXX",
"rhtml": "RHTML",
"ring": "Ring",
"rmarkdown": "RMarkdown",
"robotframework": "RobotFramework",
"roff": "Roff",
"roff_manpage": "Roff Manpage",
"rouge": "Rouge",
"rpc": "RPC",
"rpcgen": "RPC",
"rpm_spec": "RPM Spec",
"rs-274x": "Gerber Image",
"rscript": "R",
"rss": "XML",
"rst": "reStructuredText",
"ruby": "Ruby",
"runoff": "RUNOFF",
"rust": "Rust",
"rusthon": "Python",
"sage": "Sage",
"salt": "SaltStack",
"saltstack": "SaltStack",
"saltstate": "SaltStack",
"sas": "SAS",
"sass": "Sass",
"scala": "Scala",
"scaml": "Scaml",
"scheme": "Scheme",
"scilab": "Scilab",
"scss": "SCSS",
"sed": "sed",
"self": "Self",
"sh": "Shell",
"shaderlab": "ShaderLab",
"shell": "Shell",
"shell-script": "Shell",
"shellsession": "ShellSession",
"shen": "Shen",
"slash": "Slash",
"slice": "Slice",
"slim": "Slim",
"smali": "Smali",
"smalltalk": "Smalltalk",
"smarty": "Smarty",
"sml": "Standard ML",
"smt": "SMT",
"snippet": "YASnippet",
"solidity": "Solidity",
"sourcemod": "SourcePawn",
"sourcepawn": "SourcePawn",
"soy": "Closure Templates",
"sparql": "SPARQL",
"specfile": "RPM Spec",
"spline_font_database": "Spline Font Database",
"splus": "R",
"sqf": "SQF",
"sql": "SQL",
"sqlpl": "SQLPL",
"squeak": "Smalltalk",
"squirrel": "Squirrel",
"srecode_template": "SRecode Template",
"stan": "Stan",
"standard_ml": "Standard ML",
"stata": "Stata",
"ston": "STON",
"stylus": "Stylus",
"subrip_text": "SubRip Text",
"sugarss": "SugarSS",
"supercollider": "SuperCollider",
"svg": "SVG",
"swift": "Swift",
"systemverilog": "SystemVerilog",
"tcl": "Tcl",
"tcsh": "Tcsh",
"tea": "Tea",
"terra": "Terra",
"terraform": "HCL",
"tex": "TeX",
"text": "Text",
"textile": "Textile",
"thrift": "Thrift",
"ti_program": "TI Program",
"tl": "Type Language",
"tla": "TLA",
"toml": "TOML",
"troff": "Roff",
"ts": "TypeScript",
"turing": "Turing",
"turtle": "Turtle",
"twig": "Twig",
"txl": "TXL",
"type_language": "Type Language",
"typescript": "TypeScript",
"udiff": "Diff",
"unified_parallel_c": "Unified Parallel C",
"unity3d_asset": "Unity3D Asset",
"unix_assembly": "Unix Assembly",
"uno": "Uno",
"unrealscript": "UnrealScript",
"ur": "UrWeb",
"ur/web": "UrWeb",
"urweb": "UrWeb",
"vala": "Vala",
"vb.net": "Visual Basic",
"vbnet": "Visual Basic",
"vcl": "VCL",
"verilog": "Verilog",
"vhdl": "VHDL",
"vim": "Vim script",
"vim_script": "Vim script",
"viml": "Vim script",
"visual_basic": "Visual Basic",
"volt": "Volt",
"vue": "Vue",
"wasm": "WebAssembly",
"wast": "WebAssembly",
"wavefront_material": "Wavefront Material",
"wavefront_object": "Wavefront Object",
"wdl": "wdl",
"web_ontology_language": "Web Ontology Language",
"webassembly": "WebAssembly",
"webidl": "WebIDL",
"winbatch": "Batchfile",
"windows_registry_entries": "Windows Registry Entries",
"wisp": "wisp",
"world_of_warcraft_addon_data": "World of Warcraft Addon Data",
"wsdl": "XML",
"x10": "X10",
"xbase": "xBase",
"xc": "XC",
"xcompose": "XCompose",
"xhtml": "HTML",
"xml": "XML",
"xml+genshi": "Genshi",
"xml+kid": "Genshi",
"xojo": "Xojo",
"xpages": "XPages",
"xproc": "XProc",
"xquery": "XQuery",
"xs": "XS",
"xsd": "XML",
"xsl": "XSLT",
"xslt": "XSLT",
"xten": "X10",
"xtend": "Xtend",
"yacc": "Yacc",
"yaml": "YAML",
"yang": "YANG",
"yml": "YAML",
"zephir": "Zephir",
"zimpl": "Zimpl",
"zsh": "Shell",
"wsdl": "XML",
"x10": "X10",
"x_bitmap": "X BitMap",
"x_font_directory_index": "X Font Directory Index",
"x_pixmap": "X PixMap",
"xbase": "xBase",
"xbm": "X BitMap",
"xc": "XC",
"xcompose": "XCompose",
"xdr": "RPC",
"xhtml": "HTML",
"xml": "XML",
"xml+genshi": "Genshi",
"xml+kid": "Genshi",
"xojo": "Xojo",
"xpages": "XPages",
"xpm": "X PixMap",
"xproc": "XProc",
"xquery": "XQuery",
"xs": "XS",
"xsd": "XML",
"xsl": "XSLT",
"xslt": "XSLT",
"xten": "X10",
"xtend": "Xtend",
"yacc": "Yacc",
"yaml": "YAML",
"yang": "YANG",
"yara": "YARA",
"yas": "YASnippet",
"yasnippet": "YASnippet",
"yml": "YAML",
"zephir": "Zephir",
"zig": "Zig",
"zimpl": "Zimpl",
"zsh": "Shell",
}
// LanguageByAlias looks up the language name by its alias or name.
// It mirrors the logic of GitHub Linguist and is needed e.g. for heuristics.yml,
// which mixes names and aliases in a language field (see XPM example).
func LanguageByAlias(langOrAlias string) (lang string, ok bool) {
k := convertToAliasKey(langOrAlias)
lang, ok = LanguageByAliasMap[k]
return
}
// convertToAliasKey converts a language name to a key in LanguageByAliasMap.
// It follows the conventions of
// - internal.code-generator.generator.convertToAliasKey()
// - GetLanguageByAlias()
// It lives here to avoid a dependency on the "generate" and "enry" packages.
func convertToAliasKey(langName string) string {
ak := strings.SplitN(langName, `,`, 2)[0]
ak = strings.Replace(ak, ` `, `_`, -1)
ak = strings.ToLower(ak)
return ak
}
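For illustration, a minimal standalone sketch of the lookup above. The map `aliasToLanguage` holds only a hand-picked subset of the generated entries, and the names `aliasToLanguage`, `aliasKey` and `lookupLanguage` are hypothetical, not part of the generated package.

```go
package main

import (
	"fmt"
	"strings"
)

// aliasToLanguage is a tiny, hand-picked subset of the generated map,
// used here for illustration only (hypothetical name).
var aliasToLanguage = map[string]string{
	"xpm":          "X PixMap",
	"x_pixmap":     "X PixMap",
	"roff_manpage": "Roff Manpage",
	"yml":          "YAML",
}

// aliasKey applies the same normalization as convertToAliasKey: keep the
// text before the first comma, replace spaces with underscores, lowercase.
func aliasKey(name string) string {
	key := strings.SplitN(name, ",", 2)[0]
	key = strings.Replace(key, " ", "_", -1)
	return strings.ToLower(key)
}

// lookupLanguage resolves either a language name or one of its aliases.
func lookupLanguage(nameOrAlias string) (string, bool) {
	lang, ok := aliasToLanguage[aliasKey(nameOrAlias)]
	return lang, ok
}

func main() {
	for _, in := range []string{"XPM", "X PixMap", "Roff Manpage", "nope"} {
		lang, ok := lookupLanguage(in)
		fmt.Printf("%q -> %q (found: %v)\n", in, lang, ok)
	}
}
```

Because names and aliases are normalized to the same key form, a field that contains `XPM` and one that contains `X PixMap` resolve to the same language.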

@ -1,7 +1,7 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
// linguist's commit from which files were generated.
var LinguistCommit = "d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68"
var LinguistCommit = "e4560984058b4726010ca4b8f03ed9d0f8f464db"

File diff suppressed because it is too large

@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -8,10 +8,12 @@ import "gopkg.in/toqueteos/substring.v1"
var DocumentationMatchers = substring.Or(
substring.Regexp(`^[Dd]ocs?/`),
substring.Regexp(`(^|/)[Dd]ocumentation/`),
substring.Regexp(`(^|/)[Gg]roovydoc/`),
substring.Regexp(`(^|/)[Jj]avadoc/`),
substring.Regexp(`^[Mm]an/`),
substring.Regexp(`^[Ee]xamples/`),
substring.Regexp(`^[Dd]emos?/`),
substring.Regexp(`(^|/)inst/doc/`),
substring.Regexp(`(^|/)CHANGE(S|LOG)?(\.|$)`),
substring.Regexp(`(^|/)CONTRIBUTING(\.|$)`),
substring.Regexp(`(^|/)COPYING(\.|$)`),
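As a rough sketch of how matchers like these classify a path, the example below swaps in the standard library `regexp` package instead of `gopkg.in/toqueteos/substring.v1`, so it is not the generated code's actual mechanism. Only four of the patterns are copied, and `docPatterns` and `isDocumentationPath` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// docPatterns copies a few of the patterns listed above; the stdlib
// regexp package stands in for the substring matcher (an assumption).
var docPatterns = []*regexp.Regexp{
	regexp.MustCompile(`^[Dd]ocs?/`),
	regexp.MustCompile(`(^|/)[Dd]ocumentation/`),
	regexp.MustCompile(`(^|/)[Gg]roovydoc/`),
	regexp.MustCompile(`(^|/)CHANGE(S|LOG)?(\.|$)`),
}

// isDocumentationPath reports whether any pattern matches the path,
// which is the "or" semantics of combining the matchers.
func isDocumentationPath(path string) bool {
	for _, re := range docPatterns {
		if re.MatchString(path) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{"docs/index.md", "src/groovydoc/index.html", "CHANGELOG.md", "src/main.go"} {
		fmt.Printf("%-26s documentation: %v\n", p, isDocumentationPath(p))
	}
}
```

Note that `^[Dd]ocs?/` is anchored to the start of the path, while the `(^|/)` patterns also match in nested directories.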

File diff suppressed because it is too large

@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -8,40 +8,65 @@ var LanguagesByFilename = map[string][]string{
".XCompose": {"XCompose"},
".abbrev_defs": {"Emacs Lisp"},
".arcconfig": {"JSON"},
".babelrc": {"JSON5"},
".atomignore": {"Ignore List"},
".babelignore": {"Ignore List"},
".babelrc": {"JSON with Comments"},
".bash_aliases": {"Shell"},
".bash_history": {"Shell"},
".bash_logout": {"Shell"},
".bash_profile": {"Shell"},
".bashrc": {"Shell"},
".bzrignore": {"Ignore List"},
".clang-format": {"YAML"},
".clang-tidy": {"YAML"},
".classpath": {"XML"},
".coffeelintignore": {"Ignore List"},
".cproject": {"XML"},
".cshrc": {"Shell"},
".cvsignore": {"Ignore List"},
".dockerignore": {"Ignore List"},
".editorconfig": {"INI"},
".emacs": {"Emacs Lisp"},
".emacs.desktop": {"Emacs Lisp"},
".eslintignore": {"Ignore List"},
".eslintrc.json": {"JSON with Comments"},
".factor-boot-rc": {"Factor"},
".factor-rc": {"Factor"},
".gclient": {"Python"},
".gemrc": {"YAML"},
".gitconfig": {"INI"},
".gitattributes": {"Git Attributes"},
".gitconfig": {"Git Config"},
".gitignore": {"Ignore List"},
".gitmodules": {"Git Config"},
".gn": {"GN"},
".gnus": {"Emacs Lisp"},
".gvimrc": {"Vim script"},
".htaccess": {"ApacheConf"},
".htmlhintrc": {"JSON"},
".irbrc": {"Ruby"},
".jshintrc": {"JSON"},
".jscsrc": {"JSON with Comments"},
".jshintrc": {"JSON with Comments"},
".jslintrc": {"JSON with Comments"},
".login": {"Shell"},
".nanorc": {"nanorc"},
".nodemonignore": {"Ignore List"},
".npmignore": {"Ignore List"},
".nvimrc": {"Vim script"},
".php": {"PHP"},
".php_cs": {"PHP"},
".php_cs.dist": {"PHP"},
".prettierignore": {"Ignore List"},
".profile": {"Shell"},
".project": {"XML"},
".pryrc": {"Ruby"},
".spacemacs": {"Emacs Lisp"},
".stylelintignore": {"Ignore List"},
".tern-config": {"JSON"},
".tern-project": {"JSON"},
".vimrc": {"Vim script"},
".viper": {"Emacs Lisp"},
".vscodeignore": {"Ignore List"},
".watchmanconfig": {"JSON"},
".zlogin": {"Shell"},
".zlogout": {"Shell"},
".zprofile": {"Shell"},
@ -54,6 +79,7 @@ var LanguagesByFilename = map[string][]string{
"BSDmakefile": {"Makefile"},
"BUCK": {"Python"},
"BUILD": {"Python"},
"BUILD.bazel": {"Python"},
"Berksfile": {"Ruby"},
"Brewfile": {"Ruby"},
"Buildfile": {"Ruby"},
@ -63,6 +89,7 @@ var LanguagesByFilename = map[string][]string{
"COPYRIGHT.regex": {"Text"},
"Cakefile": {"CoffeeScript"},
"Capfile": {"Ruby"},
"Cargo.lock": {"TOML"},
"Cask": {"Emacs Lisp"},
"Dangerfile": {"Ruby"},
"Deliverfile": {"Ruby"},
@ -74,6 +101,7 @@ var LanguagesByFilename = map[string][]string{
"GNUmakefile": {"Makefile"},
"Gemfile": {"Ruby"},
"Gemfile.lock": {"Ruby"},
"Gopkg.lock": {"TOML"},
"Guardfile": {"Ruby"},
"INSTALL": {"Text"},
"INSTALL.mysql": {"Text"},
@ -84,6 +112,7 @@ var LanguagesByFilename = map[string][]string{
"LICENSE": {"Text"},
"LICENSE.mysql": {"Text"},
"Makefile": {"Makefile"},
"Makefile.PL": {"Perl"},
"Makefile.am": {"Makefile"},
"Makefile.boot": {"Makefile"},
"Makefile.frag": {"Makefile"},
@ -106,7 +135,7 @@ var LanguagesByFilename = map[string][]string{
"README.mysql": {"Text"},
"ROOT": {"Isabelle ROOT"},
"Rakefile": {"Ruby"},
"Rexfile": {"Perl 6"},
"Rexfile": {"Perl"},
"SConscript": {"Python"},
"SConstruct": {"Python"},
"Settings.StyleCop": {"XML"},
@ -126,25 +155,44 @@ var LanguagesByFilename = map[string][]string{
"ack": {"Perl"},
"ant.xml": {"Ant Build System"},
"apache2.conf": {"ApacheConf"},
"bash_aliases": {"Shell"},
"bash_logout": {"Shell"},
"bash_profile": {"Shell"},
"bashrc": {"Shell"},
"build.xml": {"Ant Build System"},
"buildfile": {"Ruby"},
"buildozer.spec": {"INI"},
"click.me": {"Text"},
"composer.lock": {"JSON"},
"configure.ac": {"M4Sugar"},
"contents.lr": {"Markdown"},
"cpanfile": {"Perl"},
"cshrc": {"Shell"},
"delete.me": {"Text"},
"descrip.mmk": {"Module Management System"},
"descrip.mms": {"Module Management System"},
"encodings.dir": {"X Font Directory Index"},
"expr-dist": {"R"},
"firestore.rules": {"Cloud Firestore Security Rules"},
"fonts.alias": {"X Font Directory Index"},
"fonts.dir": {"X Font Directory Index"},
"fonts.scale": {"X Font Directory Index"},
"fp-lib-table": {"KiCad Layout"},
"gitignore-global": {"Ignore List"},
"gitignore_global": {"Ignore List"},
"glide.lock": {"YAML"},
"go.mod": {"Text"},
"go.sum": {"Text"},
"gradlew": {"Shell"},
"gvimrc": {"Vim script"},
"haproxy.cfg": {"HAProxy"},
"httpd.conf": {"ApacheConf"},
"jsconfig.json": {"JSON with Comments"},
"keep.me": {"Text"},
"ld.script": {"Linker Script"},
"login": {"Shell"},
"m3makefile": {"Quake"},
"m3overrides": {"Quake"},
"makefile": {"Makefile"},
"makefile.sco": {"Makefile"},
"man": {"Shell"},
@ -155,7 +203,10 @@ var LanguagesByFilename = map[string][]string{
"mkfile": {"Makefile"},
"mmn": {"Roff"},
"mmt": {"Roff"},
"nanorc": {"nanorc"},
"nextflow.config": {"Nextflow"},
"nginx.conf": {"Nginx"},
"nim.cfg": {"Nim"},
"nvimrc": {"Vim script"},
"owh": {"Tcl"},
"packages.config": {"XML"},
@ -167,9 +218,9 @@ var LanguagesByFilename = map[string][]string{
"rebar.config.lock": {"Erlang"},
"rebar.lock": {"Erlang"},
"riemann.config": {"Clojure"},
"script": {"C"},
"starfield": {"Tcl"},
"test.me": {"Text"},
"tsconfig.json": {"JSON with Comments"},
"vimrc": {"Vim script"},
"wscript": {"Python"},
"xcompose": {"XCompose"},

File diff suppressed because it is too large

@ -0,0 +1,40 @@
# Tests care about number and order of heuristics in this fixture
disambiguations:
- extensions: ['.h', '.hh']
  rules:
  - language: Objective-C
    pattern: 'objc'
  - language: C++
    named_pattern: cpp
- extensions: ['.f']
  rules:
  - language: Forth
    pattern: #as in .md
    - 'f'
    - 'f1'
  - language: Filebench WML
    pattern: 'f2'
  - language: Fortran
    named_pattern: fortran
- extensions: ['.ms']
  rules:
  - language: Roff
    pattern: 'rp'
  - language: Unix Assembly
    and:
    - negative_pattern: 'np'
    - pattern: 'p'
  - language: MAXScript
- extensions: ['.mod']
  rules:
  - language: [Linux Kernel Module, AMPL]
named_patterns:
  cpp:
  - 'regex1'
  - 'regex2'
  fortran: 'regex3'
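To make the rule semantics of the fixture concrete, here is a minimal sketch of how the `.ms` block might be evaluated. It assumes that rules are tried in order, that the first matching rule wins, that all children of an `and` block must hold, and that a rule with no pattern always matches. The `rule` type, `firstMatch` and `msRules` are hypothetical names, not the generator's actual types.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is a hypothetical in-memory form of one entry under "rules".
type rule struct {
	language string
	pattern  *regexp.Regexp // must match when set
	negative *regexp.Regexp // must NOT match when set
	and      []rule         // every child must match
}

// matches reports whether all conditions of the rule hold for content.
// A rule without any condition matches everything (assumed fallback).
func (r rule) matches(content []byte) bool {
	if r.pattern != nil && !r.pattern.Match(content) {
		return false
	}
	if r.negative != nil && r.negative.Match(content) {
		return false
	}
	for _, child := range r.and {
		if !child.matches(content) {
			return false
		}
	}
	return true
}

// firstMatch returns the language of the first rule that matches.
func firstMatch(rules []rule, content []byte) (string, bool) {
	for _, r := range rules {
		if r.matches(content) {
			return r.language, true
		}
	}
	return "", false
}

// msRules mirrors the '.ms' block of the fixture above.
var msRules = []rule{
	{language: "Roff", pattern: regexp.MustCompile(`rp`)},
	{language: "Unix Assembly", and: []rule{
		{negative: regexp.MustCompile(`np`)},
		{pattern: regexp.MustCompile(`p`)},
	}},
	{language: "MAXScript"},
}

func main() {
	for _, content := range []string{"rp", "p", "np"} {
		lang, _ := firstMatch(msRules, []byte(content))
		fmt.Printf("%q -> %s\n", content, lang)
	}
}
```

Under these assumptions the order of rules matters, which would be consistent with the fixture's comment that the tests care about the number and order of heuristics.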

@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -7,6 +7,8 @@ var LanguagesByInterpreter = map[string][]string{
"Rscript": {"R"},
"apl": {"APL"},
"aplx": {"APL"},
"ash": {"Shell"},
"asy": {"Asymptote"},
"awk": {"Awk"},
"bash": {"Shell"},
"bigloo": {"Scheme"},
@ -15,10 +17,13 @@ var LanguagesByInterpreter = map[string][]string{
"chicken": {"Scheme"},
"clisp": {"Common Lisp"},
"coffee": {"CoffeeScript"},
"cperl": {"Perl"},
"crystal": {"Crystal"},
"csi": {"Scheme"},
"cvc4": {"SMT"},
"cwl-runner": {"Common Workflow Language"},
"dart": {"Dart"},
"dash": {"Shell"},
"dtrace": {"DTrace"},
"dyalog": {"APL"},
"ecl": {"Common Lisp"},
@ -26,11 +31,15 @@ var LanguagesByInterpreter = map[string][]string{
"escript": {"Erlang"},
"fish": {"fish"},
"gawk": {"Awk"},
"gerbv": {"Gerber Image"},
"gerbview": {"Gerber Image"},
"gn": {"GN"},
"gnuplot": {"Gnuplot"},
"gosh": {"Scheme"},
"groovy": {"Groovy"},
"gsed": {"sed"},
"guile": {"Scheme"},
"hy": {"Hy"},
"instantfpc": {"Pascal"},
"io": {"Io"},
"ioke": {"Ioke"},
@ -38,6 +47,7 @@ var LanguagesByInterpreter = map[string][]string{
"jolie": {"Jolie"},
"jruby": {"Ruby"},
"julia": {"Julia"},
"ksh": {"Shell"},
"lisp": {"Common Lisp"},
"lsl": {"LSL"},
"lua": {"Lua", "Terra"},
@ -45,11 +55,14 @@ var LanguagesByInterpreter = map[string][]string{
"make": {"Makefile"},
"mathsat5": {"SMT"},
"mawk": {"Awk"},
"minised": {"sed"},
"mksh": {"Shell"},
"mmi": {"Mercury"},
"moon": {"MoonScript"},
"nawk": {"Awk"},
"newlisp": {"NewLisp"},
"node": {"JavaScript"},
"nextflow": {"Nextflow"},
"node": {"JavaScript", "TypeScript"},
"nush": {"Nu"},
"ocaml": {"OCaml", "Reason"},
"ocamlrun": {"OCaml"},
@ -58,12 +71,14 @@ var LanguagesByInterpreter = map[string][]string{
"opensmt": {"SMT"},
"osascript": {"AppleScript"},
"parrot": {"Parrot Assembly", "Parrot Internal Representation"},
"perl": {"Perl"},
"perl6": {"Perl 6"},
"pdksh": {"Shell"},
"perl": {"Perl", "Pod"},
"perl6": {"Perl 6", "Pod 6"},
"php": {"PHP"},
"picolisp": {"PicoLisp"},
"pike": {"Pike"},
"pil": {"PicoLisp"},
"pwsh": {"PowerShell"},
"python": {"Python"},
"python2": {"Python"},
"python3": {"Python"},
@ -80,11 +95,14 @@ var LanguagesByInterpreter = map[string][]string{
"runhaskell": {"Haskell"},
"sbcl": {"Common Lisp"},
"scala": {"Scala"},
"scheme": {"Scheme"},
"sclang": {"SuperCollider"},
"scsynth": {"SuperCollider"},
"sed": {"sed"},
"sh": {"Shell"},
"smt-rat": {"SMT"},
"smtinterpol": {"SMT"},
"ssed": {"sed"},
"stp": {"SMT"},
"swipl": {"Prolog"},
"tcc": {"C"},
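A simplified sketch of how a map like this can be consulted from a script's first line. The shebang parsing here is an assumption and handles only the plain `#!/path/to/interp` and `#!/usr/bin/env interp` forms; the map is a small excerpt, and `interpreterFromShebang` and `candidates` are hypothetical names.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// languagesByInterpreter is a small excerpt of the generated map above.
var languagesByInterpreter = map[string][]string{
	"node": {"JavaScript", "TypeScript"},
	"perl": {"Perl", "Pod"},
	"dash": {"Shell"},
	"pwsh": {"PowerShell"},
}

// interpreterFromShebang extracts the interpreter name from a first line.
// Simplified: it does not strip version suffixes or handle exec tricks.
func interpreterFromShebang(firstLine string) string {
	if !strings.HasPrefix(firstLine, "#!") {
		return ""
	}
	fields := strings.Fields(strings.TrimPrefix(firstLine, "#!"))
	if len(fields) == 0 {
		return ""
	}
	name := path.Base(fields[0])
	if name == "env" {
		if len(fields) < 2 {
			return ""
		}
		name = path.Base(fields[1])
	}
	return name
}

// candidates returns the languages registered for the line's interpreter.
func candidates(firstLine string) []string {
	return languagesByInterpreter[interpreterFromShebang(firstLine)]
}

func main() {
	for _, line := range []string{"#!/usr/bin/env node", "#!/bin/dash", "#! /usr/bin/perl -w"} {
		fmt.Printf("%-22s -> %v\n", line, candidates(line))
	}
}
```

An interpreter that maps to more than one language, such as `node` or `perl` above, yields several candidates, so a later strategy still has to pick one.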

@ -1,203 +1,219 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
var LanguagesMime = map[string]string{
"AGS Script": "text/x-c++src",
"APL": "text/apl",
"ASN.1": "text/x-ttcn-asn",
"ASP": "application/x-aspx",
"Alpine Abuild": "text/x-sh",
"Ant Build System": "application/xml",
"Apex": "text/x-java",
"Arduino": "text/x-c++src",
"Brainfuck": "text/x-brainfuck",
"C": "text/x-csrc",
"C#": "text/x-csharp",
"C++": "text/x-c++src",
"C2hs Haskell": "text/x-haskell",
"CMake": "text/x-cmake",
"COBOL": "text/x-cobol",
"COLLADA": "text/xml",
"CSON": "text/x-coffeescript",
"CSS": "text/css",
"ChucK": "text/x-java",
"Clojure": "text/x-clojure",
"Closure Templates": "text/x-soy",
"CoffeeScript": "text/x-coffeescript",
"Common Lisp": "text/x-common-lisp",
"Component Pascal": "text/x-pascal",
"Crystal": "text/x-crystal",
"Cuda": "text/x-c++src",
"Cycript": "text/javascript",
"Cython": "text/x-cython",
"D": "text/x-d",
"DTrace": "text/x-csrc",
"Dart": "application/dart",
"Diff": "text/x-diff",
"Dockerfile": "text/x-dockerfile",
"Dylan": "text/x-dylan",
"EBNF": "text/x-ebnf",
"ECL": "text/x-ecl",
"EQ": "text/x-csharp",
"Eagle": "text/xml",
"Easybuild": "text/x-python",
"Ecere Projects": "application/json",
"Eiffel": "text/x-eiffel",
"Elm": "text/x-elm",
"Emacs Lisp": "text/x-common-lisp",
"EmberScript": "text/x-coffeescript",
"Erlang": "text/x-erlang",
"F#": "text/x-fsharp",
"Factor": "text/x-factor",
"Forth": "text/x-forth",
"Fortran": "text/x-fortran",
"GCC Machine Description": "text/x-common-lisp",
"AGS Script": "text/x-c++src",
"APL": "text/apl",
"ASN.1": "text/x-ttcn-asn",
"ASP": "application/x-aspx",
"Alpine Abuild": "text/x-sh",
"AngelScript": "text/x-c++src",
"Ant Build System": "application/xml",
"Apex": "text/x-java",
"Asymptote": "text/x-kotlin",
"Brainfuck": "text/x-brainfuck",
"C": "text/x-csrc",
"C#": "text/x-csharp",
"C++": "text/x-c++src",
"C2hs Haskell": "text/x-haskell",
"CMake": "text/x-cmake",
"COBOL": "text/x-cobol",
"COLLADA": "text/xml",
"CSON": "text/x-coffeescript",
"CSS": "text/css",
"ChucK": "text/x-java",
"Clojure": "text/x-clojure",
"Closure Templates": "text/x-soy",
"Cloud Firestore Security Rules": "text/css",
"CoffeeScript": "text/x-coffeescript",
"Common Lisp": "text/x-common-lisp",
"Common Workflow Language": "text/x-yaml",
"Component Pascal": "text/x-pascal",
"Crystal": "text/x-crystal",
"Cuda": "text/x-c++src",
"Cycript": "text/javascript",
"Cython": "text/x-cython",
"D": "text/x-d",
"DTrace": "text/x-csrc",
"Dart": "application/dart",
"Diff": "text/x-diff",
"Dockerfile": "text/x-dockerfile",
"Dylan": "text/x-dylan",
"EBNF": "text/x-ebnf",
"ECL": "text/x-ecl",
"EQ": "text/x-csharp",
"Eagle": "text/xml",
"Easybuild": "text/x-python",
"Ecere Projects": "application/json",
"Edje Data Collection": "application/json",
"Eiffel": "text/x-eiffel",
"Elm": "text/x-elm",
"Emacs Lisp": "text/x-common-lisp",
"EmberScript": "text/x-coffeescript",
"Erlang": "text/x-erlang",
"F#": "text/x-fsharp",
"Factor": "text/x-factor",
"Forth": "text/x-forth",
"Fortran": "text/x-fortran",
"GCC Machine Description": "text/x-common-lisp",
"GN": "text/x-python",
"Game Maker Language": "text/x-c++src",
"Genshi": "text/xml",
"Gentoo Ebuild": "text/x-sh",
"Gentoo Eclass": "text/x-sh",
"Git Attributes": "text/x-sh",
"Git Config": "text/x-properties",
"Glyph": "text/x-tcl",
"Go": "text/x-go",
"Grammatical Framework": "text/x-haskell",
"Groovy": "text/x-groovy",
"Groovy Server Pages": "application/x-jsp",
"HCL": "text/x-ruby",
"HTML": "text/html",
"HTML+Django": "text/x-django",
"HTML+ECR": "text/html",
"HTML+EEX": "text/html",
"HTML+ERB": "application/x-erb",
"HTML+PHP": "application/x-httpd-php",
"HTTP": "message/http",
"Hack": "application/x-httpd-php",
"Haml": "text/x-haml",
"Haskell": "text/x-haskell",
"Haxe": "text/x-haxe",
"IDL": "text/x-idl",
"INI": "text/x-properties",
"IRC log": "text/mirc",
"JSON": "application/json",
"JSON5": "application/json",
"JSONiq": "application/json",
"JSX": "text/jsx",
"Java": "text/x-java",
"Java Server Pages": "application/x-jsp",
"JavaScript": "text/javascript",
"Julia": "text/x-julia",
"Jupyter Notebook": "application/json",
"Kit": "text/html",
"Kotlin": "text/x-kotlin",
"LFE": "text/x-common-lisp",
"LabVIEW": "text/xml",
"Latte": "text/x-smarty",
"Less": "text/css",
"Literate Haskell": "text/x-literate-haskell",
"LiveScript": "text/x-livescript",
"LookML": "text/x-yaml",
"Lua": "text/x-lua",
"M": "text/x-mumps",
"MTML": "text/html",
"MUF": "text/x-forth",
"Makefile": "text/x-cmake",
"Markdown": "text/x-gfm",
"Marko": "text/html",
"Mathematica": "text/x-mathematica",
"Matlab": "text/x-octave",
"Maven POM": "text/xml",
"Max": "application/json",
"Metal": "text/x-c++src",
"Mirah": "text/x-ruby",
"Modelica": "text/x-modelica",
"NSIS": "text/x-nsis",
"NetLogo": "text/x-common-lisp",
"NewLisp": "text/x-common-lisp",
"Nginx": "text/x-nginx-conf",
"Nu": "text/x-scheme",
"NumPy": "text/x-python",
"OCaml": "text/x-ocaml",
"Objective-C": "text/x-objectivec",
"Objective-C++": "text/x-objectivec",
"OpenCL": "text/x-csrc",
"OpenRC runscript": "text/x-sh",
"Oz": "text/x-oz",
"PHP": "application/x-httpd-php",
"PLSQL": "text/x-plsql",
"PLpgSQL": "text/x-sql",
"Pascal": "text/x-pascal",
"Perl": "text/x-perl",
"Perl 6": "text/x-perl",
"Pic": "text/troff",
"Pod": "text/x-perl",
"PowerShell": "application/x-powershell",
"Protocol Buffer": "text/x-protobuf",
"Public Key": "application/pgp",
"Pug": "text/x-pug",
"Puppet": "text/x-puppet",
"PureScript": "text/x-haskell",
"Python": "text/x-python",
"R": "text/x-rsrc",
"RAML": "text/x-yaml",
"RHTML": "application/x-erb",
"RMarkdown": "text/x-gfm",
"RPM Spec": "text/x-rpm-spec",
"Reason": "text/x-rustsrc",
"Roff": "text/troff",
"Rouge": "text/x-clojure",
"Ruby": "text/x-ruby",
"Rust": "text/x-rustsrc",
"SAS": "text/x-sas",
"SCSS": "text/x-scss",
"SPARQL": "application/sparql-query",
"SQL": "text/x-sql",
"SQLPL": "text/x-sql",
"SRecode Template": "text/x-common-lisp",
"SVG": "text/xml",
"Sage": "text/x-python",
"SaltStack": "text/x-yaml",
"Sass": "text/x-sass",
"Scala": "text/x-scala",
"Scheme": "text/x-scheme",
"Shell": "text/x-sh",
"ShellSession": "text/x-sh",
"Slim": "text/x-slim",
"Smalltalk": "text/x-stsrc",
"Smarty": "text/x-smarty",
"Squirrel": "text/x-c++src",
"Standard ML": "text/x-ocaml",
"Sublime Text Config": "text/javascript",
"Swift": "text/x-swift",
"SystemVerilog": "text/x-systemverilog",
"TOML": "text/x-toml",
"Tcl": "text/x-tcl",
"Tcsh": "text/x-sh",
"TeX": "text/x-stex",
"Terra": "text/x-lua",
"Textile": "text/x-textile",
"Turtle": "text/turtle",
"Twig": "text/x-twig",
"TypeScript": "application/typescript",
"Unified Parallel C": "text/x-csrc",
"Unity3D Asset": "text/x-yaml",
"Uno": "text/x-csharp",
"UnrealScript": "text/x-java",
"VHDL": "text/x-vhdl",
"Verilog": "text/x-verilog",
"Visual Basic": "text/x-vb",
"Volt": "text/x-d",
"WebAssembly": "text/x-common-lisp",
"WebIDL": "text/x-webidl",
"XC": "text/x-csrc",
"XML": "text/xml",
"XPages": "text/xml",
"XProc": "text/xml",
"XQuery": "application/xquery",
"XS": "text/x-csrc",
"XSLT": "text/xml",
"YAML": "text/x-yaml",
"edn": "text/x-clojure",
"reStructuredText": "text/x-rst",
"wisp": "text/x-clojure",
"HCL": "text/x-ruby",
"HTML": "text/html",
"HTML+Django": "text/x-django",
"HTML+ECR": "text/html",
"HTML+EEX": "text/html",
"HTML+ERB": "application/x-erb",
"HTML+PHP": "application/x-httpd-php",
"HTML+Razor": "text/html",
"HTTP": "message/http",
"Hack": "application/x-httpd-php",
"Haml": "text/x-haml",
"Haskell": "text/x-haskell",
"Haxe": "text/x-haxe",
"IDL": "text/x-idl",
"INI": "text/x-properties",
"IRC log": "text/mirc",
"Ignore List": "text/x-sh",
"JSON": "application/json",
"JSON with Comments": "text/javascript",
"JSON5": "application/json",
"JSONLD": "application/json",
"JSONiq": "application/json",
"JSX": "text/jsx",
"Java": "text/x-java",
"Java Properties": "text/x-properties",
"Java Server Pages": "application/x-jsp",
"JavaScript": "text/javascript",
"Julia": "text/x-julia",
"Jupyter Notebook": "application/json",
"KiCad Layout": "text/x-common-lisp",
"Kit": "text/html",
"Kotlin": "text/x-kotlin",
"LFE": "text/x-common-lisp",
"LTspice Symbol": "text/x-spreadsheet",
"LabVIEW": "text/xml",
"Latte": "text/x-smarty",
"Less": "text/css",
"Literate Haskell": "text/x-literate-haskell",
"LiveScript": "text/x-livescript",
"LookML": "text/x-yaml",
"Lua": "text/x-lua",
"M": "text/x-mumps",
"MATLAB": "text/x-octave",
"MTML": "text/html",
"MUF": "text/x-forth",
"Makefile": "text/x-cmake",
"Markdown": "text/x-gfm",
"Marko": "text/html",
"Mathematica": "text/x-mathematica",
"Maven POM": "text/xml",
"Max": "application/json",
"Metal": "text/x-c++src",
"Mirah": "text/x-ruby",
"Modelica": "text/x-modelica",
"NSIS": "text/x-nsis",
"NetLogo": "text/x-common-lisp",
"NewLisp": "text/x-common-lisp",
"Nginx": "text/x-nginx-conf",
"Nu": "text/x-scheme",
"NumPy": "text/x-python",
"OCaml": "text/x-ocaml",
"Objective-C": "text/x-objectivec",
"Objective-C++": "text/x-objectivec",
"OpenCL": "text/x-csrc",
"OpenRC runscript": "text/x-sh",
"Oz": "text/x-oz",
"PHP": "application/x-httpd-php",
"PLSQL": "text/x-plsql",
"PLpgSQL": "text/x-sql",
"Pascal": "text/x-pascal",
"Perl": "text/x-perl",
"Perl 6": "text/x-perl",
"Pic": "text/troff",
"Pod": "text/x-perl",
"PowerShell": "application/x-powershell",
"Protocol Buffer": "text/x-protobuf",
"Public Key": "application/pgp",
"Pug": "text/x-pug",
"Puppet": "text/x-puppet",
"PureScript": "text/x-haskell",
"Python": "text/x-python",
"R": "text/x-rsrc",
"RAML": "text/x-yaml",
"RHTML": "application/x-erb",
"RMarkdown": "text/x-gfm",
"RPM Spec": "text/x-rpm-spec",
"Reason": "text/x-rustsrc",
"Roff": "text/troff",
"Roff Manpage": "text/troff",
"Rouge": "text/x-clojure",
"Ruby": "text/x-ruby",
"Rust": "text/x-rustsrc",
"SAS": "text/x-sas",
"SCSS": "text/x-scss",
"SPARQL": "application/sparql-query",
"SQL": "text/x-sql",
"SQLPL": "text/x-sql",
"SRecode Template": "text/x-common-lisp",
"SVG": "text/xml",
"Sage": "text/x-python",
"SaltStack": "text/x-yaml",
"Sass": "text/x-sass",
"Scala": "text/x-scala",
"Scheme": "text/x-scheme",
"Shell": "text/x-sh",
"ShellSession": "text/x-sh",
"Slim": "text/x-slim",
"Smalltalk": "text/x-stsrc",
"Smarty": "text/x-smarty",
"Squirrel": "text/x-c++src",
"Standard ML": "text/x-ocaml",
"Swift": "text/x-swift",
"SystemVerilog": "text/x-systemverilog",
"TOML": "text/x-toml",
"Tcl": "text/x-tcl",
"Tcsh": "text/x-sh",
"TeX": "text/x-stex",
"Terra": "text/x-lua",
"Textile": "text/x-textile",
"Turtle": "text/turtle",
"Twig": "text/x-twig",
"TypeScript": "application/typescript",
"Unified Parallel C": "text/x-csrc",
"Unity3D Asset": "text/x-yaml",
"Uno": "text/x-csharp",
"UnrealScript": "text/x-java",
"VHDL": "text/x-vhdl",
"Verilog": "text/x-verilog",
"Visual Basic": "text/x-vb",
"Volt": "text/x-d",
"WebAssembly": "text/x-common-lisp",
"WebIDL": "text/x-webidl",
"Windows Registry Entries": "text/x-properties",
"X BitMap": "text/x-csrc",
"X PixMap": "text/x-csrc",
"XC": "text/x-csrc",
"XML": "text/xml",
"XPages": "text/xml",
"XProc": "text/xml",
"XQuery": "application/xquery",
"XS": "text/x-csrc",
"XSLT": "text/xml",
"YAML": "text/x-yaml",
"edn": "text/x-clojure",
"reStructuredText": "text/x-rst",
"wisp": "text/x-clojure",
}

@ -1,295 +1,322 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
var LanguagesType = map[string]int{
"1C Enterprise": 2,
"ABAP": 2,
"ABNF": 1,
"AGS Script": 2,
"AMPL": 2,
"ANTLR": 2,
"API Blueprint": 3,
"APL": 2,
"ASN.1": 1,
"ASP": 2,
"ATS": 2,
"ActionScript": 2,
"Ada": 2,
"Agda": 2,
"Alloy": 2,
"Alpine Abuild": 2,
"Ant Build System": 1,
"ApacheConf": 3,
"Apex": 2,
"Apollo Guidance Computer": 2,
"AppleScript": 2,
"Arc": 2,
"Arduino": 2,
"AsciiDoc": 4,
"AspectJ": 2,
"Assembly": 2,
"Augeas": 2,
"AutoHotkey": 2,
"AutoIt": 2,
"Awk": 2,
"Batchfile": 2,
"Befunge": 2,
"Bison": 2,
"BitBake": 2,
"Blade": 3,
"BlitzBasic": 2,
"BlitzMax": 2,
"Bluespec": 2,
"Boo": 2,
"Brainfuck": 2,
"Brightscript": 2,
"Bro": 2,
"C": 2,
"C#": 2,
"C++": 2,
"C-ObjDump": 1,
"C2hs Haskell": 2,
"CLIPS": 2,
"CMake": 2,
"COBOL": 2,
"COLLADA": 1,
"CSON": 1,
"CSS": 3,
"CSV": 1,
"CWeb": 2,
"Cap'n Proto": 2,
"CartoCSS": 2,
"Ceylon": 2,
"Chapel": 2,
"Charity": 2,
"ChucK": 2,
"Cirru": 2,
"Clarion": 2,
"Clean": 2,
"Click": 2,
"Clojure": 2,
"Closure Templates": 3,
"CoffeeScript": 2,
"ColdFusion": 2,
"ColdFusion CFC": 2,
"Common Lisp": 2,
"Component Pascal": 2,
"Cool": 2,
"Coq": 2,
"Cpp-ObjDump": 1,
"Creole": 4,
"Crystal": 2,
"Csound": 2,
"Csound Document": 2,
"Csound Score": 2,
"Cuda": 2,
"Cycript": 2,
"Cython": 2,
"D": 2,
"D-ObjDump": 1,
"DIGITAL Command Language": 2,
"DM": 2,
"DNS Zone": 1,
"DTrace": 2,
"Darcs Patch": 1,
"Dart": 2,
"Diff": 1,
"Dockerfile": 1,
"Dogescript": 2,
"Dylan": 2,
"E": 2,
"EBNF": 1,
"ECL": 2,
"ECLiPSe": 2,
"EJS": 3,
"EQ": 2,
"Eagle": 3,
"Easybuild": 1,
"Ecere Projects": 1,
"Eiffel": 2,
"Elixir": 2,
"Elm": 2,
"Emacs Lisp": 2,
"EmberScript": 2,
"Erlang": 2,
"F#": 2,
"FLUX": 2,
"Factor": 2,
"Fancy": 2,
"Fantom": 2,
"Filebench WML": 2,
"Filterscript": 2,
"Formatted": 1,
"Forth": 2,
"Fortran": 2,
"FreeMarker": 2,
"Frege": 2,
"G-code": 1,
"GAMS": 2,
"GAP": 2,
"1C Enterprise": 2,
"ABAP": 2,
"ABNF": 1,
"AGS Script": 2,
"AMPL": 2,
"ANTLR": 2,
"API Blueprint": 3,
"APL": 2,
"ASN.1": 1,
"ASP": 2,
"ATS": 2,
"ActionScript": 2,
"Ada": 2,
"Adobe Font Metrics": 1,
"Agda": 2,
"Alloy": 2,
"Alpine Abuild": 2,
"AngelScript": 2,
"Ant Build System": 1,
"ApacheConf": 1,
"Apex": 2,
"Apollo Guidance Computer": 2,
"AppleScript": 2,
"Arc": 2,
"AsciiDoc": 4,
"AspectJ": 2,
"Assembly": 2,
"Asymptote": 2,
"Augeas": 2,
"AutoHotkey": 2,
"AutoIt": 2,
"Awk": 2,
"Ballerina": 2,
"Batchfile": 2,
"Befunge": 2,
"Bison": 2,
"BitBake": 2,
"Blade": 3,
"BlitzBasic": 2,
"BlitzMax": 2,
"Bluespec": 2,
"Boo": 2,
"Brainfuck": 2,
"Brightscript": 2,
"Bro": 2,
"C": 2,
"C#": 2,
"C++": 2,
"C-ObjDump": 1,
"C2hs Haskell": 2,
"CLIPS": 2,
"CMake": 2,
"COBOL": 2,
"COLLADA": 1,
"CSON": 1,
"CSS": 3,
"CSV": 1,
"CWeb": 2,
"Cap'n Proto": 2,
"CartoCSS": 2,
"Ceylon": 2,
"Chapel": 2,
"Charity": 2,
"ChucK": 2,
"Cirru": 2,
"Clarion": 2,
"Clean": 2,
"Click": 2,
"Clojure": 2,
"Closure Templates": 3,
"Cloud Firestore Security Rules": 1,
"CoNLL-U": 1,
"CoffeeScript": 2,
"ColdFusion": 2,
"ColdFusion CFC": 2,
"Common Lisp": 2,
"Common Workflow Language": 2,
"Component Pascal": 2,
"Cool": 2,
"Coq": 2,
"Cpp-ObjDump": 1,
"Creole": 4,
"Crystal": 2,
"Csound": 2,
"Csound Document": 2,
"Csound Score": 2,
"Cuda": 2,
"Cycript": 2,
"Cython": 2,
"D": 2,
"D-ObjDump": 1,
"DIGITAL Command Language": 2,
"DM": 2,
"DNS Zone": 1,
"DTrace": 2,
"Darcs Patch": 1,
"Dart": 2,
"DataWeave": 2,
"Diff": 1,
"Dockerfile": 2,
"Dogescript": 2,
"Dylan": 2,
"E": 2,
"EBNF": 1,
"ECL": 2,
"ECLiPSe": 2,
"EJS": 3,
"EML": 1,
"EQ": 2,
"Eagle": 1,
"Easybuild": 1,
"Ecere Projects": 1,
"Edje Data Collection": 1,
"Eiffel": 2,
"Elixir": 2,
"Elm": 2,
"Emacs Lisp": 2,
"EmberScript": 2,
"Erlang": 2,
"F#": 2,
"F*": 2,
"FIGlet Font": 1,
"FLUX": 2,
"Factor": 2,
"Fancy": 2,
"Fantom": 2,
"Filebench WML": 2,
"Filterscript": 2,
"Formatted": 1,
"Forth": 2,
"Fortran": 2,
"FreeMarker": 2,
"Frege": 2,
"G-code": 1,
"GAMS": 2,
"GAP": 2,
"GCC Machine Description": 2,
"GDB": 2,
"GDScript": 2,
"GLSL": 2,
"GN": 1,
"Game Maker Language": 2,
"Genie": 2,
"Genshi": 2,
"Gentoo Ebuild": 2,
"Gentoo Eclass": 2,
"Gettext Catalog": 4,
"Gherkin": 2,
"Glyph": 2,
"Gnuplot": 2,
"Go": 2,
"Golo": 2,
"Gosu": 2,
"Grace": 2,
"Gradle": 1,
"Grammatical Framework": 2,
"Graph Modeling Language": 1,
"GraphQL": 1,
"Graphviz (DOT)": 1,
"Groovy": 2,
"Groovy Server Pages": 2,
"HCL": 2,
"HLSL": 2,
"HTML": 3,
"HTML+Django": 3,
"HTML+ECR": 3,
"HTML+EEX": 3,
"HTML+ERB": 3,
"HTML+PHP": 3,
"HTTP": 1,
"Hack": 2,
"Haml": 3,
"Handlebars": 3,
"Harbour": 2,
"Haskell": 2,
"Haxe": 2,
"Hy": 2,
"HyPhy": 2,
"IDL": 2,
"IGOR Pro": 2,
"INI": 1,
"IRC log": 1,
"Idris": 2,
"Inform 7": 2,
"Inno Setup": 2,
"Io": 2,
"Ioke": 2,
"Isabelle": 2,
"Isabelle ROOT": 2,
"J": 2,
"JFlex": 2,
"JSON": 1,
"JSON5": 1,
"JSONLD": 1,
"JSONiq": 2,
"JSX": 2,
"Jasmin": 2,
"Java": 2,
"Java Server Pages": 2,
"JavaScript": 2,
"Jison": 2,
"Jison Lex": 2,
"Jolie": 2,
"Julia": 2,
"Jupyter Notebook": 3,
"KRL": 2,
"KiCad": 2,
"Kit": 3,
"Kotlin": 2,
"LFE": 2,
"LLVM": 2,
"LOLCODE": 2,
"LSL": 2,
"LabVIEW": 2,
"Lasso": 2,
"Latte": 3,
"Lean": 2,
"Less": 3,
"Lex": 2,
"LilyPond": 2,
"Limbo": 2,
"Linker Script": 1,
"Linux Kernel Module": 1,
"Liquid": 3,
"Literate Agda": 2,
"Literate CoffeeScript": 2,
"Literate Haskell": 2,
"LiveScript": 2,
"Logos": 2,
"Logtalk": 2,
"LookML": 2,
"LoomScript": 2,
"Lua": 2,
"M": 2,
"M4": 2,
"M4Sugar": 2,
"MAXScript": 2,
"MQL4": 2,
"MQL5": 2,
"MTML": 3,
"MUF": 2,
"Makefile": 2,
"Mako": 2,
"Markdown": 4,
"Marko": 3,
"Mask": 3,
"Mathematica": 2,
"Matlab": 2,
"Maven POM": 1,
"Max": 2,
"MediaWiki": 4,
"Mercury": 2,
"Meson": 2,
"Metal": 2,
"MiniD": 2,
"Mirah": 2,
"Modelica": 2,
"Modula-2": 2,
"Module Management System": 2,
"Monkey": 2,
"Moocode": 2,
"MoonScript": 2,
"Myghty": 2,
"NCL": 2,
"NL": 1,
"NSIS": 2,
"Nemerle": 2,
"NetLinx": 2,
"NetLinx+ERB": 2,
"NetLogo": 2,
"NewLisp": 2,
"Nginx": 3,
"Nim": 2,
"Ninja": 1,
"Nit": 2,
"Nix": 2,
"Nu": 2,
"NumPy": 2,
"OCaml": 2,
"ObjDump": 1,
"Objective-C": 2,
"Objective-C++": 2,
"Objective-J": 2,
"Omgrofl": 2,
"Opa": 2,
"Opal": 2,
"OpenCL": 2,
"OpenEdge ABL": 2,
"OpenRC runscript": 2,
"OpenSCAD": 2,
"OpenType Feature File": 1,
"Game Maker Language": 2,
"Genie": 2,
"Genshi": 2,
"Gentoo Ebuild": 2,
"Gentoo Eclass": 2,
"Gerber Image": 1,
"Gettext Catalog": 4,
"Gherkin": 2,
"Git Attributes": 1,
"Git Config": 1,
"Glyph": 2,
"Glyph Bitmap Distribution Format": 1,
"Gnuplot": 2,
"Go": 2,
"Golo": 2,
"Gosu": 2,
"Grace": 2,
"Gradle": 1,
"Grammatical Framework": 2,
"Graph Modeling Language": 1,
"GraphQL": 1,
"Graphviz (DOT)": 1,
"Groovy": 2,
"Groovy Server Pages": 2,
"HAProxy": 1,
"HCL": 2,
"HLSL": 2,
"HTML": 3,
"HTML+Django": 3,
"HTML+ECR": 3,
"HTML+EEX": 3,
"HTML+ERB": 3,
"HTML+PHP": 3,
"HTML+Razor": 3,
"HTTP": 1,
"HXML": 1,
"Hack": 2,
"Haml": 3,
"Handlebars": 3,
"Harbour": 2,
"Haskell": 2,
"Haxe": 2,
"HiveQL": 2,
"Hy": 2,
"HyPhy": 2,
"IDL": 2,
"IGOR Pro": 2,
"INI": 1,
"IRC log": 1,
"Idris": 2,
"Ignore List": 1,
"Inform 7": 2,
"Inno Setup": 2,
"Io": 2,
"Ioke": 2,
"Isabelle": 2,
"Isabelle ROOT": 2,
"J": 2,
"JFlex": 2,
"JSON": 1,
"JSON with Comments": 1,
"JSON5": 1,
"JSONLD": 1,
"JSONiq": 2,
"JSX": 2,
"Jasmin": 2,
"Java": 2,
"Java Properties": 1,
"Java Server Pages": 2,
"JavaScript": 2,
"Jison": 2,
"Jison Lex": 2,
"Jolie": 2,
"Julia": 2,
"Jupyter Notebook": 3,
"KRL": 2,
"KiCad Layout": 1,
"KiCad Legacy Layout": 1,
"KiCad Schematic": 1,
"Kit": 3,
"Kotlin": 2,
"LFE": 2,
"LLVM": 2,
"LOLCODE": 2,
"LSL": 2,
"LTspice Symbol": 1,
"LabVIEW": 2,
"Lasso": 2,
"Latte": 3,
"Lean": 2,
"Less": 3,
"Lex": 2,
"LilyPond": 2,
"Limbo": 2,
"Linker Script": 1,
"Linux Kernel Module": 1,
"Liquid": 3,
"Literate Agda": 2,
"Literate CoffeeScript": 2,
"Literate Haskell": 2,
"LiveScript": 2,
"Logos": 2,
"Logtalk": 2,
"LookML": 2,
"LoomScript": 2,
"Lua": 2,
"M": 2,
"M4": 2,
"M4Sugar": 2,
"MATLAB": 2,
"MAXScript": 2,
"MQL4": 2,
"MQL5": 2,
"MTML": 3,
"MUF": 2,
"Makefile": 2,
"Mako": 2,
"Markdown": 4,
"Marko": 3,
"Mask": 3,
"Mathematica": 2,
"Maven POM": 1,
"Max": 2,
"MediaWiki": 4,
"Mercury": 2,
"Meson": 2,
"Metal": 2,
"MiniD": 2,
"Mirah": 2,
"Modelica": 2,
"Modula-2": 2,
"Modula-3": 2,
"Module Management System": 2,
"Monkey": 2,
"Moocode": 2,
"MoonScript": 2,
"Myghty": 2,
"NCL": 2,
"NL": 1,
"NSIS": 2,
"Nearley": 2,
"Nemerle": 2,
"NetLinx": 2,
"NetLinx+ERB": 2,
"NetLogo": 2,
"NewLisp": 2,
"Nextflow": 2,
"Nginx": 1,
"Nim": 2,
"Ninja": 1,
"Nit": 2,
"Nix": 2,
"Nu": 2,
"NumPy": 2,
"OCaml": 2,
"ObjDump": 1,
"Objective-C": 2,
"Objective-C++": 2,
"Objective-J": 2,
"Omgrofl": 2,
"Opa": 2,
"Opal": 2,
"OpenCL": 2,
"OpenEdge ABL": 2,
"OpenRC runscript": 2,
"OpenSCAD": 2,
"OpenType Feature File": 1,
"Org": 4,
"Ox": 2,
"Oxygene": 2,
"Oz": 2,
"P4": 2,
"PAWN": 2,
"PHP": 2,
"PLSQL": 2,
"PLpgSQL": 2,
@ -300,6 +327,7 @@ var LanguagesType = map[string]int{
"Parrot Assembly": 2,
"Parrot Internal Representation": 2,
"Pascal": 2,
"Pawn": 2,
"Pep8": 2,
"Perl": 2,
"Perl 6": 2,
@ -309,19 +337,21 @@ var LanguagesType = map[string]int{
"PigLatin": 2,
"Pike": 2,
"Pod": 4,
"Pod 6": 4,
"PogoScript": 2,
"Pony": 2,
"PostCSS": 3,
"PostScript": 3,
"PowerBuilder": 2,
"PowerShell": 2,
"Processing": 2,
"Prolog": 2,
"Propeller Spin": 2,
"Protocol Buffer": 3,
"Protocol Buffer": 1,
"Public Key": 1,
"Pug": 3,
"Puppet": 2,
"Pure Data": 2,
"Pure Data": 1,
"PureBasic": 2,
"PureScript": 2,
"Python": 2,
@ -329,6 +359,7 @@ var LanguagesType = map[string]int{
"Python traceback": 1,
"QML": 2,
"QMake": 2,
"Quake": 2,
"R": 2,
"RAML": 3,
"RDoc": 4,
@ -336,6 +367,7 @@ var LanguagesType = map[string]int{
"REXX": 2,
"RHTML": 3,
"RMarkdown": 4,
"RPC": 2,
"RPM Spec": 1,
"RUNOFF": 3,
"Racket": 2,
@ -352,6 +384,7 @@ var LanguagesType = map[string]int{
"Ring": 2,
"RobotFramework": 2,
"Roff": 3,
"Roff Manpage": 3,
"Rouge": 2,
"Ruby": 2,
"Rust": 2,
@ -378,10 +411,12 @@ var LanguagesType = map[string]int{
"ShellSession": 2,
"Shen": 2,
"Slash": 2,
"Slice": 2,
"Slim": 3,
"Smali": 2,
"Smalltalk": 2,
"Smarty": 2,
"Solidity": 2,
"SourcePawn": 2,
"Spline Font Database": 1,
"Squirrel": 2,
@ -390,7 +425,7 @@ var LanguagesType = map[string]int{
"Stata": 2,
"Stylus": 3,
"SubRip Text": 1,
"Sublime Text Config": 1,
"SugarSS": 3,
"SuperCollider": 2,
"Swift": 2,
"SystemVerilog": 2,
@ -427,34 +462,45 @@ var LanguagesType = map[string]int{
"Vue": 3,
"Wavefront Material": 1,
"Wavefront Object": 1,
"Web Ontology Language": 3,
"Web Ontology Language": 1,
"WebAssembly": 2,
"WebIDL": 2,
"Windows Registry Entries": 1,
"World of Warcraft Addon Data": 1,
"X10": 2,
"XC": 2,
"XCompose": 1,
"XML": 1,
"XPages": 2,
"XProc": 2,
"XQuery": 2,
"XS": 2,
"XSLT": 2,
"Xojo": 2,
"Xtend": 2,
"YAML": 1,
"YANG": 1,
"Yacc": 2,
"Zephir": 2,
"Zimpl": 2,
"desktop": 1,
"eC": 2,
"edn": 1,
"fish": 2,
"mupad": 2,
"nesC": 2,
"ooc": 2,
"reStructuredText": 4,
"wisp": 2,
"xBase": 2,
"X BitMap": 1,
"X Font Directory Index": 1,
"X PixMap": 1,
"X10": 2,
"XC": 2,
"XCompose": 1,
"XML": 1,
"XPages": 1,
"XProc": 2,
"XQuery": 2,
"XS": 2,
"XSLT": 2,
"Xojo": 2,
"Xtend": 2,
"YAML": 1,
"YANG": 1,
"YARA": 2,
"YASnippet": 3,
"Yacc": 2,
"Zephir": 2,
"Zig": 2,
"Zimpl": 2,
"desktop": 1,
"eC": 2,
"edn": 1,
"fish": 2,
"mupad": 2,
"nanorc": 1,
"nesC": 2,
"ooc": 2,
"q": 2,
"reStructuredText": 4,
"sed": 2,
"wdl": 2,
"wisp": 2,
"xBase": 2,
}

@ -1,5 +1,5 @@
// Code generated by gopkg.in/src-d/enry.v1/internal/code-generator DO NOT EDIT.
// Extracted from github/linguist commit: d5c8db3fb91963c4b2762ca2ea2ff7cfac109f68
// Extracted from github/linguist commit: e4560984058b4726010ca4b8f03ed9d0f8f464db
package data
@ -10,7 +10,6 @@ var VendorMatchers = substring.Or(
substring.Regexp(`^[Dd]ependencies/`),
substring.Regexp(`(^|/)dist/`),
substring.Regexp(`^deps/`),
substring.Regexp(`^tools/`),
substring.Regexp(`(^|/)configure$`),
substring.Regexp(`(^|/)config.guess$`),
substring.Regexp(`(^|/)config.sub$`),
@ -32,13 +31,15 @@ var VendorMatchers = substring.Or(
substring.Regexp(`(^|/)bootstrap([^.]*)\.(js|css|less|scss|styl)$`),
substring.Regexp(`(^|/)custom\.bootstrap([^\s]*)(js|css|less|scss|styl)$`),
substring.Regexp(`(^|/)font-awesome\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)font-awesome/.*\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)foundation\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)normalize\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)skeleton\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)[Bb]ourbon/.*\.(css|less|scss|styl)$`),
substring.Regexp(`(^|/)animate\.(css|less|scss|styl)$`),
substring.Regexp(`third[-_]?party/`),
substring.Regexp(`3rd[-_]?party/`),
substring.Regexp(`(^|/)materialize\.(css|less|scss|styl|js)$`),
substring.Regexp(`(^|/)select2/.*\.(css|scss|js)$`),
substring.Regexp(`(3rd|[Tt]hird)[-_]?[Pp]arty/`),
substring.Regexp(`vendors?/`),
substring.Regexp(`extern(al)?/`),
substring.Regexp(`(^|/)[Vv]+endor/`),
@ -53,6 +54,9 @@ var VendorMatchers = substring.Or(
substring.Regexp(`jquery.fancybox.(js|css)`),
substring.Regexp(`fuelux.js`),
substring.Regexp(`(^|/)jquery\.fileupload(-\w+)?\.js$`),
substring.Regexp(`jquery.dataTables.js`),
substring.Regexp(`bootbox.js`),
substring.Regexp(`pdf.worker.js`),
substring.Regexp(`(^|/)slick\.\w+.js$`),
substring.Regexp(`(^|/)Leaflet\.Coordinates-\d+\.\d+\.\d+\.src\.js$`),
substring.Regexp(`leaflet.draw-src.js`),
@ -63,6 +67,7 @@ var VendorMatchers = substring.Or(
substring.Regexp(`wicket-leaflet.js`),
substring.Regexp(`.sublime-project`),
substring.Regexp(`.sublime-workspace`),
substring.Regexp(`.vscode`),
substring.Regexp(`(^|/)prototype(.*)\.js$`),
substring.Regexp(`(^|/)effects\.js$`),
substring.Regexp(`(^|/)controls\.js$`),
@ -87,6 +92,7 @@ var VendorMatchers = substring.Or(
substring.Regexp(`(^|/)angular([^.]*)\.js$`),
substring.Regexp(`(^|\/)d3(\.v\d+)?([^.]*)\.js$`),
substring.Regexp(`(^|/)react(-[^.]*)?\.js$`),
substring.Regexp(`(^|/)flow-typed/.*\.js$`),
substring.Regexp(`(^|/)modernizr\-\d\.\d+(\.\d+)?\.js$`),
substring.Regexp(`(^|/)modernizr\.custom\.\d+\.js$`),
substring.Regexp(`(^|/)knockout-(\d+\.){3}(debug\.)?js$`),
@ -98,8 +104,7 @@ var VendorMatchers = substring.Or(
substring.Regexp(`^.osx$`),
substring.Regexp(`\.xctemplate/`),
substring.Regexp(`\.imageset/`),
substring.Regexp(`^Carthage/`),
substring.Regexp(`^Pods/`),
substring.Regexp(`(^|/)Carthage/`),
substring.Regexp(`(^|/)Sparkle/`),
substring.Regexp(`Crashlytics.framework/`),
substring.Regexp(`Fabric.framework/`),
@ -112,6 +117,9 @@ var VendorMatchers = substring.Or(
substring.Regexp(`(^|/)gradlew$`),
substring.Regexp(`(^|/)gradlew\.bat$`),
substring.Regexp(`(^|/)gradle/wrapper/`),
substring.Regexp(`(^|/)mvnw$`),
substring.Regexp(`(^|/)mvnw\.cmd$`),
substring.Regexp(`(^|/)\.mvn/wrapper/`),
substring.Regexp(`-vsdoc\.js$`),
substring.Regexp(`\.intellisense\.js$`),
substring.Regexp(`(^|/)jquery([^.]*)\.validate(\.unobtrusive)?\.js$`),
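
A minimal sketch of how patterns like the ones in this hunk classify paths. It uses the standard library `regexp` package rather than the `substring` package the generated file relies on, and `vendorPatterns` and `isVendorPath` are hypothetical names chosen for illustration, not part of enry. The pattern subset is hand-picked from the listing above.

```go
package main

import (
	"fmt"
	"regexp"
)

// vendorPatterns is a small, hand-picked subset of the patterns in the hunk above.
// Assumption: plain regexp matching stands in for the generated substring.Or matcher.
var vendorPatterns = []*regexp.Regexp{
	regexp.MustCompile(`(^|/)dist/`),
	regexp.MustCompile(`(3rd|[Tt]hird)[-_]?[Pp]arty/`),
	regexp.MustCompile(`(^|/)Carthage/`),
	regexp.MustCompile(`(^|/)mvnw$`),
	regexp.MustCompile(`(^|/)flow-typed/.*\.js$`),
}

// isVendorPath (hypothetical helper) reports whether any pattern matches path.
func isVendorPath(path string) bool {
	for _, re := range vendorPatterns {
		if re.MatchString(path) {
			return true
		}
	}
	return false
}

func main() {
	paths := []string{
		"third_party/lib/a.c",    // matched by the merged third-party pattern
		"app/Carthage/Build/x.h", // matched now that Carthage/ is no longer anchored to the root
		"tools/gen.go",           // not matched by this subset
		"src/main.go",            // ordinary source file
	}
	for _, p := range paths {
		fmt.Printf("%-26s vendored=%v\n", p, isVendorPath(p))
	}
}
```

With only this subset, `tools/gen.go` is not reported as vendored, which is consistent with `^tools/` no longer appearing in the new list.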

@ -2,12 +2,14 @@ package generator
import (
"bytes"
"gopkg.in/yaml.v2"
"io"
"io/ioutil"
"gopkg.in/yaml.v2"
)
// Vendor reads from fileToParse and builds source file from tmplPath. It complies with type File signature.
// Vendor generates regex matchers in Go for vendoring files/dirs.
// It is of generator.File type.
func Vendor(fileToParse, samplesDir, outPath, tmplPath, tmplName, commit string) error {
data, err := ioutil.ReadFile(fileToParse)
if err != nil {

@ -20,7 +20,7 @@ const (
extensionsTmpl = "extension.go.tmpl"
// content.go generation
heuristicsRuby = ".linguist/lib/linguist/heuristics.rb"
heuristicsYAML = ".linguist/lib/linguist/heuristics.yml"
contentFile = "data/content.go"
contentTmplPath = "internal/code-generator/assets/content.go.tmpl"
contentTmpl = "content.go.tmpl"
@ -93,7 +93,7 @@ func main() {
fileList := []*generatorFiles{
{generator.Extensions, languagesYAML, "", extensionsFile, extensionsTmplPath, extensionsTmpl, commit},
{generator.Heuristics, heuristicsRuby, "", contentFile, contentTmplPath, contentTmpl, commit},
{generator.GenHeuristics, heuristicsYAML, "", contentFile, contentTmplPath, contentTmpl, commit},
{generator.Vendor, vendorYAML, "", vendorFile, vendorTmplPath, vendorTmpl, commit},
{generator.Documentation, documentationYAML, "", documentationFile, documentationTmplPath, documentationTmpl, commit},
{generator.Types, languagesYAML, "", typeFile, typeTmplPath, typeTmpl, commit},