Commit Graph

66 Commits

Author SHA1 Message Date
Alex Bezzubov
6d99af7bbc doc: reformat & clarify 2023-02-16 17:46:23 +01:00
Alex Bezzubov
3aeb9879da heuristics regexp engine configurable #2, skip rules at runtime 2023-01-19 19:50:22 +01:00
Alex Bezzubov
d8913b00e9 code-gen: re-generate code & fixtures 2022-12-25 22:37:52 +01:00
Alex
2059129b5e
Merge branch 'master' into spelling 2022-12-03 10:48:23 +01:00
Alex
a8344728a7
Merge pull request #143 from go-enry/re-collation-at-codegen
Move vendored RE collation to codegen
2022-12-02 10:11:39 +01:00
Alex Bezzubov
2c708f0b6c gen: re-generate aliases & test fixture for go 1.19 2022-12-02 09:55:50 +01:00
Alex Bezzubov
f4051b0f16 gen: re-generated vendors using build-time optimization 2022-12-01 22:16:54 +01:00
Josh Soref
2822da6054 spelling: sequentially
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
8363b28e63 spelling: receives
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Alex
9b19067edc
Merge pull request #133 from lafriks-fork/feat/generated_files_proto_go_sum
Generated proto file and PNP detection
2022-10-06 10:01:35 +02:00
Josh Soref
b4e2aae0cf spelling: identifies
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-08-09 00:45:29 -04:00
Josh Soref
cfef8c28e5 spelling: disambiguates
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-08-08 23:31:59 -04:00
Josh Soref
7b3e094013 spelling: arbitrary
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-08-08 22:52:56 -04:00
Lauris BH
05907fe7ec
Change Yarn PnP regex to include all .pnp.* files 2022-08-01 21:35:46 +03:00
Lauris BH
a6f32e054c
Remove go.sum from generated 2022-08-01 21:06:24 +03:00
Lauris BH
fe195c67a9
Generated go.sum and proto file detection 2022-07-18 15:08:20 +03:00
github-actions
60edc790b3 Updated Linguist to v7.21.0 2022-06-09 20:09:50 +00:00
github-actions
9f73cdf211 Updated Linguist to v7.20.0 2022-04-05 20:12:53 +00:00
Alex
6f052a7bc7
Merge pull request #112 from silverwind/poetry-generated
Add poetry.lock to generated files
2022-03-24 12:15:28 +01:00
Lauris BH
ae2b0576a7
Add check for non-backtracking subexpressions 2022-03-21 13:54:11 +02:00
silverwind
b1bf2238b3
Add poetry.lock to generated files
`poetry.lock` is a file generated by the Python Poetry package manager,
see https://python-poetry.org/docs/basic-usage/ for reference.
2022-03-17 15:29:07 +01:00
Alex
6de77247e4
Revert "Update Linguist to v7.19.0" 2022-03-04 10:44:00 +01:00
github-actions
8eac4cab85 Updated Linguist to v7.19.0 2022-03-03 20:08:49 +00:00
github-actions
513c659119 Updated Linguist to v7.19.0 2022-02-21 20:08:34 +00:00
github-actions
2febea0489 Updated Linguist to v7.18.0 2021-12-15 20:08:13 +00:00
github-actions
b3ee64f627 Updated Linguist to v7.17.0 2021-11-14 18:33:24 +01:00
Luke Francl
02878b9c9f Rename CodemirrorMode to CodeMirrorMode
It is a bit of a Rubyism to translate "CodeMirror Mode" into "codemirror_mode".
This is more in line with Go practices.
2021-10-12 16:18:33 -07:00
Luke Francl
b6b72c6c08 Add documentation to LanguageInfo struct fields
These are adapted from https://github.com/github/linguist/blob/master/lib/linguist/languages.yml
2021-10-12 16:13:59 -07:00
Luke Francl
6212f1fcb4 Remove name -> LanguageInfo mapping per code review
The GetLanguageInfo method is now implemented in terms of GetLanguageInfoByID.
This is possible because you can use GetLanguageID to get the ID for a language.
2021-10-12 13:29:39 -07:00
Luke Francl
b248b21349 Expose LanguageInfo with all Linguist data
As discussed in https://github.com/go-enry/go-enry/issues/54, this provides an
API for accessing a LanguageInfo struct which is populated with all the data
from the Linguist YAML source file. Functions are provided to access the
LanguageInfo by name or ID.

The other top-level functions like GetLanguageExtensions, GetLanguageGroup, etc.
could in principle be implemented using this structure, which would simplify the
code generation. But that would be a big change so I didn't do any of that.
Perhaps in the next major version something like that would make sense.
2021-10-11 13:32:29 -07:00
Lauris BH
0affa3ccca Update to Linguist v7.16.1 2021-09-25 23:57:50 +03:00
Luke Francl
dfb8041dcc Update generated code for Linguist 7.14.0 2021-04-26 09:36:25 -07:00
Luke Francl
eb043e80a8 Add GetLanguageID function
The Linguist-defined language IDs are important to our use case because they are
used as database identifiers. This adds a new generator to extract the language
IDs into a map and uses that to implement GetLanguageID.

Because one language has the ID 0, there is no way to tell if a language name is
found or not. If desired, we could add this by returning (string, bool) from
GetLanguageID. But none of the other functions that take language names do this,
so I didn't want to introduce it here.
2021-04-13 11:49:21 -07:00
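
Note on the ID 0 ambiguity described in the commit above: a minimal Go sketch, assuming a generated name-to-ID map. The map contents and the names languageIDs, lookupID and lookupIDOK are hypothetical and are not enry's API; they only illustrate why a single int return cannot distinguish "ID 0" from "not found", and how a two-value return would.

```go
package main

import "fmt"

// languageIDs is a hypothetical stand-in for the generated map.
// The names and IDs are made up for illustration.
var languageIDs = map[string]int{
	"LangZero": 0,
	"LangOne":  1,
}

// lookupID has the single-value shape: an unknown name yields the
// zero value 0, which collides with the real ID 0.
func lookupID(name string) int {
	return languageIDs[name]
}

// lookupIDOK has the two-value shape: the bool reports whether the
// name was present, so ID 0 is no longer ambiguous.
func lookupIDOK(name string) (int, bool) {
	id, ok := languageIDs[name]
	return id, ok
}

func main() {
	fmt.Println(lookupID("LangZero"), lookupID("Unknown"))
	fmt.Println(lookupIDOK("LangZero"))
	fmt.Println(lookupIDOK("Unknown"))
}
```

The two-value form follows Go's usual "comma ok" map idiom; the trade-off noted in the commit message is consistency with the other name-based functions.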
Lauris BH
c40b34c351 Sync with Linguist v7.13.0 2021-03-07 18:02:04 +02:00
Lauris BH
497e2f85d3 Sync with github/linguist version v7.12.2 2021-01-17 14:10:38 +02:00
Lauris BH
289ac3d9f0 Sync with linguist 7.12.1 2020-11-15 14:32:56 +02:00
Lauris BH
bc76dd38b0 sync to the latest github/linguist v7.11.1 2020-10-12 12:32:48 +03:00
Lauris BH
7c562a6c34 sync to the latest github/linguist v7.11.0 2020-09-17 10:34:41 +03:00
Miguel Molina
78696c2272
data: bail out in some cases if there aren't enough lines
Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
2020-05-28 13:39:59 +02:00
Miguel Molina
79398a925d
data: fix getting the first line for empty content
Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
2020-05-28 11:28:49 +02:00
Miguel Molina
8ff885a3a8
implement IsGenerated helper to filter out generated files
Closes #17

Implements the IsGenerated helper function to filter out generated
files using the rules and matchers in:
- https://github.com/github/linguist/blob/master/lib/linguist/generated.rb

Since the vast majority of matchers have very different logic, it cannot
be autogenerated directly from linguist like other logic in enry, so it's
translated by hand.

There are three different types of matchers in this implementation:
- By extension, which mark files as generated based only on the extension.
  These are the fastest matchers, so they're done first.
- By file name, which matches patterns against the filename. These
  are performed in second place. Unlike linguist, we try to use string
  functions instead of regexps as much as possible.
- Finally, the rest of the matchers, which go into the content and try
  to identify if they're generated or not based on the content. Unlike
  linguist, we try to only read the content we need and not split it
  all unless it's necessary and use byte functions instead of regexps
  as much as possible.

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
2020-05-28 08:55:13 +02:00
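
Note on the matcher ordering described in the commit above: a minimal Go sketch of the cheapest-first idea (extension, then file name, then content). The function name isGeneratedSketch and the specific rules are illustrative assumptions; they are not enry's actual rule set or implementation.

```go
package main

import (
	"bytes"
	"fmt"
	"path"
	"strings"
)

// generatedExtensions is an illustrative set, not enry's real list.
var generatedExtensions = map[string]bool{
	".nib":             true,
	".xcworkspacedata": true,
}

// isGeneratedSketch applies the three matcher kinds in order of cost.
func isGeneratedSketch(filename string, content []byte) bool {
	// 1. Extension: a map lookup, no content needed.
	if generatedExtensions[strings.ToLower(path.Ext(filename))] {
		return true
	}

	// 2. File name: plain string functions instead of regexps.
	base := path.Base(filename)
	if base == "poetry.lock" || strings.HasSuffix(base, ".pb.go") {
		return true
	}

	// 3. Content: read only what is needed (here, the first line)
	// and use byte functions instead of regexps.
	firstLine := content
	if i := bytes.IndexByte(content, '\n'); i >= 0 {
		firstLine = content[:i]
	}
	return bytes.Contains(firstLine, []byte("Code generated")) &&
		bytes.Contains(firstLine, []byte("DO NOT EDIT"))
}

func main() {
	fmt.Println(isGeneratedSketch("poetry.lock", nil))
	fmt.Println(isGeneratedSketch("main.go", []byte("package main\n")))
}
```

Running the cheap checks first means most files are classified without touching their content at all.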
Máximo Cuadros
29bc0a181b
data: replace substring package with regex package 2020-04-15 17:27:48 +02:00
Máximo Cuadros
b851ee83ad
IsTest function for top 10 languages 2020-04-06 16:23:48 +02:00
Lauris BH
97a26011a9 Return group color if language has none 2020-03-31 09:30:27 +03:00
Lauris BH
9030d3671b sync to the latest github/linguist v7.9.0 2020-03-30 01:25:57 +03:00
Alexander Bezzubov
e32a70a784
tokenizer: fix a bug and regenerate the code with the latest Go
See https://github.com/bzz/enry/pull/4 for details.

Test Plan:
 - go test ./...
2020-03-19 19:08:21 +01:00
Máximo Cuadros
84efad7693
*: module rename to go-enry/go-enry/v4 2020-03-19 17:31:29 +01:00
Alexander Bezzubov
bc5e031cee Drop src-d org ref except for issues
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-19 14:04:36 +01:00
Lauris Bukšis-Haberkorns
4e3e15e80d
Sync to linguist v7.5.1
Signed-off-by: Lauris BH <lauris@nix.lv>
2019-08-06 17:18:01 +03:00
Lauris Bukšis-Haberkorns
25b29ebdc4 Implement getting color code for languages
Signed-off-by: Lauris Bukšis-Haberkorns <lauris@nix.lv>
2019-07-19 23:59:46 +03:00