src.checks package

Submodules

src.checks.checked_in_binaries module

Check that determines the file type for every file in the project and compares it to a blacklist of binary executable file formats

class src.checks.checked_in_binaries.FactHelperFileFileType(ft: Dict[str, str])[source]

Bases: FileTypeInterface

Parameters:

ft (Dict[str, str]) –

__init__(ft: Dict[str, str]) None[source]
Parameters:

ft (Dict[str, str]) –

Return type:

None

_key() Tuple[Hashable, ...][source]

Used to decide equality in comparisons and hash based lookups

Return type:

Tuple[Hashable, …]

__module__
class src.checks.checked_in_binaries.FactHelperFile[source]

Bases: FileTypeToolInterface

file_type_of(file: Path) FactHelperFileFileType[source]
Parameters:

file (Path) –

Return type:

FactHelperFileFileType

__module__
class src.checks.checked_in_binaries.CheckedInBinaries(*args, **kwargs)[source]

Bases: CheckInterface

Represents a check that determines the file type for every file in the project and compares it to a blacklist of binary executable file formats

blacklist_dir: Path
exclude: Pattern
blacklist: Set[FileTypeInterface]
whitelist: Set[FileTypeInterface]
fileTypeTools: List[Type[FileTypeToolInterface]]
__init__(*args, **kwargs)[source]
__update_blacklist() None
Return type:

None

__init_blacklist() None
Return type:

None

__is_too_generic(file_type: FileTypeInterface) bool
Parameters:

file_type (FileTypeInterface) –

Return type:

bool

_run_all_tools() None[source]

For each available tool or library the set of detected file types is determined. All files with illegal file types are recorded.

Return type:

None

_format_findings() Dict[str, Dict[str, List[str]]][source]
Return type:

Dict[str, Dict[str, List[str]]]

__format_findings(findings: Dict[FileTypeInterface, List[Path]]) Dict[str, List[str]]
Parameters:

findings (Dict[FileTypeInterface, List[Path]]) –

Return type:

Dict[str, List[str]]

_is_ok(file_type: FileTypeInterface) bool[source]
Parameters:

file_type (FileTypeInterface) –

Return type:

bool

_calc_score() float[source]
Return type:

float

_determine_violations()[source]
run(args_dict: Dict[str, Any] | None = None) Dict[str, Any][source]
Parameters:

args_dict (Dict[str, Any] | None) –

Return type:

Dict[str, Any]

__annotations__
__module__

src.checks.comments_in_code module

Implementation of the “Comments in Code” check

class src.checks.comments_in_code.CommentsInCode(*args: Any, **kwargs: Dict[str, Any])[source]

Bases: CheckInterface

Implementation of the “Comments in Code” check

This check essentially just runs tokei and divides comment lines by combined comment and code lines. There is some additional logic to handle programming languages.

Parameters:
  • args (Any) –

  • kwargs (Dict[str, Any]) –
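As a rough illustration of the ratio described above, the following sketch divides comment lines by the sum of comment and code lines. The function name `comment_ratio` and the dictionary keys `comments` and `code` are assumptions made for this example; they are not taken from the check's source.

```python
from typing import Dict


def comment_ratio(stats: Dict[str, int]) -> float:
    """Return comments / (comments + code) for one language.

    `stats` is a hypothetical per-language record with the keys
    "comments" and "code", similar in spirit to what a line counter
    such as tokei reports.
    """
    total = stats["comments"] + stats["code"]
    # Avoid a division by zero for languages without any counted lines.
    if total == 0:
        return 0.0
    return stats["comments"] / total


ratio = comment_ratio({"comments": 25, "code": 75})
print(ratio)
```

The zero-line guard is a design choice of this sketch; the actual check may treat empty languages differently.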

__init__(*args: Any, **kwargs: Dict[str, Any]) None[source]
Parameters:
  • args (Any) –

  • kwargs (Dict[str, Any]) –

Return type:

None

__load_tokei_to_linguist() Dict[str, str | None]
Return type:

Dict[str, str | None]

__compute_l_check(tokei_to_linguist: Dict[str, str | None]) Set[str]

Compute L_check via the relation L_check = Im(A) \ {None}

Parameters:

tokei_to_linguist (Dict[str, str | None]) –

Return type:

Set[str]
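Reading the relation as "the image of the mapping with None removed", the computation can be sketched as follows. It assumes that A denotes the `tokei_to_linguist` mapping; the function name `compute_l_check` and the sample entries are hypothetical.

```python
from typing import Dict, Optional, Set


def compute_l_check(tokei_to_linguist: Dict[str, Optional[str]]) -> Set[str]:
    """Image of the mapping without the None placeholder."""
    return {lang for lang in tokei_to_linguist.values() if lang is not None}


# Hypothetical mapping: a None value marks a name without a counterpart.
mapping = {"Rust": "Rust", "Plain Text": None, "Python": "Python"}
l_check = compute_l_check(mapping)
print(sorted(l_check))
```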

__have_tokei() bool
Return type:

bool

__fetch_linguist() Dict[str, float]
Return type:

Dict[str, float]

__compute_l_repo(l_of_r: Set[str], l_check: Set[str]) Set[str]
Parameters:
  • l_of_r (Set[str]) –

  • l_check (Set[str]) –

Return type:

Set[str]

__run_tokei() Dict[str, Dict[str, int]]
Return type:

Dict[str, Dict[str, int]]

_tokei(lang: str) float[source]

Map that takes a language in L_repo to its comments to code ratio

Parameters:

lang (str) –

Return type:

float

_sigma(value: float) float[source]

Scoring function that receives the average comments to code ratio as an input and maps it to the final score.

Needed since we cannot expect a project to be 100% comments to receive a perfect score

Parameters:

value (float) –

Return type:

float

_compute_tokei() Dict[str, float][source]
Returns:

The computed tokei map

Return type:

Dict[str, float]

_compute_score(lang_ratios: Dict[str, float]) float[source]
Parameters:

lang_ratios (Dict[str, float]) –

Return type:

float

run(args_dict: Dict[str, Any] | None = None) Dict[str, Any][source]
Parameters:

args_dict (Dict[str, Any] | None) –

Return type:

Dict[str, Any]

__annotations__
__module__

src.checks.existence_of_documentation_infrastructure module

This module contains the implementation of the Existence of Documentation Infrastructure check.

The check performs a bunch of heuristics to estimate the amount of documentation that exists for the piece of software developed in a repository.

src.checks.existence_of_documentation_infrastructure.logger: Logger

Def: “Plain in-tree documentation” is defined as software documentation that is directly managed by the git vcs. In particular, no further steps are necessary to obtain the final form of the documentation after a checkout of the repository has been performed.

class src.checks.existence_of_documentation_infrastructure.PlainInTreeFile(repo: Repo, api: Project)[source]

Bases: DocumentationTypeInterface

Def: Plain in-tree documentation is said to be “file” if it is contained within plain text files that are placed in the same (sub)tree as non-documentation related files.

In practice this check does two things:

  1. It has a simple whitelist of file names that are automatically considered to contain documentation when they are found in the repository.

  2. It searches text files in the repository for links that point to files in the repository itself. It then uses those links to find the files locally.

Finally, the set of all these files (set is deduplicated) is used to compute the amount of documentation.

Parameters:
  • repo (Repo) –

  • api (Project) –

DOC_FILE_NAME_WHITELIST: Set[str]

Generated from a local dump of OpenCoDE using the following command (with some manual curation):

    for f in $(fd -t f -i --regex '\.(md|txt|rst)$'); do
        b=$(basename "$f" | rg -vi '(license|changelog|security|contrib|test|release|conduct)')
        [ -n "$b" ] && echo "$b"
    done | sort | uniq -c | sort -n | tail -n 100

_url_to_file(url: str) Path | None[source]

Tries to convert a URL (probably a link to a file in the remote GitLab repository) to a local file in our checkout. Keep in mind that the input is entirely untrusted, so path traversal issues must be avoided (even though we would only open the file and report its number of characters).

Parameters:

url (str) – url that points to some file in the remote repo

Returns:

local copy of the file

Return type:

Path | None
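One way to guard against path traversal when mapping an untrusted URL to a local file is to resolve the candidate path and verify that it still lies inside the checkout. The sketch below illustrates that idea only; the function name `url_to_local_file`, the `marker` parameter, and the assumed URL layout (`.../-/blob/<branch>/<file path>`) are assumptions and not taken from the source.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote, urlparse


def url_to_local_file(
    url: str, repo_root: Path, marker: str = "/-/blob/"
) -> Optional[Path]:
    """Map a repository URL to a file inside `repo_root`, or return None.

    Assumption: the path after `marker` has the form "<branch>/<file path>".
    """
    path = unquote(urlparse(url).path)
    if marker not in path:
        return None
    # Drop everything up to the marker, then drop the branch name.
    _, _, rest = path.partition(marker)
    _, _, relative = rest.partition("/")
    if not relative:
        return None
    root = repo_root.resolve()
    candidate = (root / relative).resolve()
    # Reject anything that escapes the checkout, e.g. via ".." segments
    # or an absolute path that replaces the root when joined.
    if root not in candidate.parents:
        return None
    return candidate if candidate.is_file() else None
```

Resolving before the containment check matters here: comparing unresolved strings would let `..` segments or symlinks slip through.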

Checks for links to documentation that point back to the repository itself, both in the publiccode.yml and in text files. Then it tries to find the respective files locally.

Returns:

Set of local documentation files that were found in that way.

Return type:

Set[Path]

static _doc_file_filter(file_name: str) bool[source]

Used to filter out all non-documentation files when iterating over a repository.

Parameters:

file_name (str) – name of the file to decide

Returns:

True iff file should be skipped

Return type:

bool

delta() Tuple[float, int][source]

Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.

Returns:

confidence in the result, and amount of documentation detected

Return type:

Tuple[float, int]

__annotations__
__module__
class src.checks.existence_of_documentation_infrastructure.PlainInTreeFolder(repo: Repo, api: Project)[source]

Bases: DocumentationTypeInterface

Def: Plain in-tree documentation is said to be “folder” if there exists a subtree that is solely comprised of documentation.

In practice, we simply look for folders that are named something like *doc*. We then recursively count characters in this subtree (text files only).

Parameters:
  • repo (Repo) –

  • api (Project) –

DOC_FOLDER_RE

used to match directory names that contain documentation

classmethod _doc_dir_predicate(dir_name: str) bool[source]
Parameters:

dir_name (str) – name of the directory to decide

Returns:

True iff the directory name indicates that the directory holds documentation.

Return type:

bool
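A directory-name predicate of this kind can be sketched with a single compiled regular expression. The pattern below (any name containing "doc", case-insensitive) is a guess based on the description "something like *doc*"; the actual value of DOC_FOLDER_RE is not shown on this page, and the names in the sketch are hypothetical.

```python
import re

# Hypothetical pattern: any directory name containing "doc", ignoring case.
EXAMPLE_DOC_FOLDER_RE = re.compile(r"doc", re.IGNORECASE)


def doc_dir_predicate(dir_name: str) -> bool:
    """True iff the directory name suggests that it holds documentation."""
    return EXAMPLE_DOC_FOLDER_RE.search(dir_name) is not None


for name in ("docs", "Documentation", "src"):
    print(name, doc_dir_predicate(name))
```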

_find_doc_dirs() Iterable[Path][source]

Returns all subtrees that are likely to hold only documentation.

Return type:

Iterable[Path]

_count_docs() int[source]
Returns:

number of non-whitespace characters in some kinds of text files that live within directories that may contain documentation.

Return type:

int

delta() Tuple[float, int][source]

Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.

Returns:

confidence in the result, and amount of documentation detected

Return type:

Tuple[float, int]

__annotations__
__module__
class src.checks.existence_of_documentation_infrastructure.OutOfTreeExternal(repo: Repo, api: Project)[source]

Bases: DocumentationTypeInterface

Def: “External out-of-tree documentation” is defined as any software documentation that cannot be generated from the contents of the source code repository of the software and is not integrated into a management software that is wrapping the git repository. For example, this includes manually curated documentation that is hosted on an external website.

In practice, this check does two things:

  1. Regex: It searches for links with something like “docs” in their preview text within some kinds of text files. It then takes all links that do not point at the repository itself and marks them as external documentation (with low confidence).

  2. It searches the ‘publiccode.yml’ and checks for keys that point at documentation. If they do not point at the project, it counts them as external documentation with high confidence.

As we cannot (and don’t want to) scrape the referenced websites, the “amount” of documentation behind an external link is just a hard-coded value.

Parameters:
  • repo (Repo) –

  • api (Project) –

HARD_CODED_SCORE: float

If some docs are found, the amount returned by the delta method will always evaluate to this score.

_get_amount() int[source]

Generates a value for the amount of documentation that was found. Since we do not want to scrape websites this is just some hard-coded value that leads to a score we like.

Returns:

The amount that leads to the hard-coded score

Return type:

int

delta() Tuple[float, int][source]

Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.

Returns:

confidence in the result, and amount of documentation detected

Return type:

Tuple[float, int]

__annotations__
__module__
class src.checks.existence_of_documentation_infrastructure.OutOfTreeWiki(repo: Repo, api: Project)[source]

Bases: DocumentationTypeInterface

Def: “Wiki out-of-tree documentation” is defined as any software documentation that is integrated into the management software that wraps the software’s git repository. In particular, this encompasses documentation that was generated from the source code repository and then made available via this method.

In the case of OpenCoDE, we check the Wiki pages of a project.

API: https://python-gitlab.readthedocs.io/en/stable/gl_objects/wikis.html

Parameters:
  • repo (Repo) –

  • api (Project) –

_fetch_wiki_pages() Dict[str, int][source]
Returns:

mapping of wiki page names to number of non-whitespace characters on the page

Return type:

Dict[str, int]

delta() Tuple[float, int][source]

Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.

Returns:

confidence in the result, and amount of documentation detected

Return type:

Tuple[float, int]

__annotations__
__module__
class src.checks.existence_of_documentation_infrastructure.ExistenceOfDocumentationInfrastructure(proj: Project, repo: Repo, api: Gitlab)[source]

Bases: CheckInterface

Implementation of the Existence of Documentation Infrastructure check.

The class only contains the high-level logic of this check. It computes the mapping delta with the help of the specialized documentation type classes and then performs the score calculation based on that.

Parameters:
  • proj (Project) –

  • repo (Repo) –

  • api (Gitlab) –

DEFAULT_RISE: float

0.5.

doc_types: List[Type[DocumentationTypeInterface]]

Specialized classes for detecting and counting the different kinds of documentation that we defined.

static _sigma(x: int, a: float = 9.967226258835993) float[source]
Parameters:
  • x (int) – amount in [0, +infty)

  • a (float) –

Returns:

score in [0,1)

Return type:

float

static sigma_inv(y: float, a: float = 9.967226258835993) float[source]
Parameters:
  • y (float) – score in [0,1)

  • a (float) –

Returns:

amount in [0, +infty)

Return type:

float

static _sigma_inv_2(x: int, y: float) float[source]
Parameters:
  • x (int) – amount in [0, +infty)

  • y (float) – score in [0,1)

Returns:

exponent a such that y = sigma(x, a)

Return type:

float
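This page does not state the formula behind these three helpers. To make their relationship concrete, the sketch below uses one hypothetical saturating function that maps amounts in [0, +infty) to scores in [0, 1) and can be inverted both for the amount and for the exponent. The functional form is purely an assumption for illustration; only the default value of `a` is copied from the signatures.

```python
import math

A_DEFAULT = 9.967226258835993  # default value of `a` from the signatures


def sigma(x: float, a: float = A_DEFAULT) -> float:
    """Hypothetical score: 0 at x = 0, approaching 1 as x grows."""
    return 1.0 - (x + 1.0) ** (-1.0 / a)


def sigma_inv(y: float, a: float = A_DEFAULT) -> float:
    """Inverse of the hypothetical sigma: the amount that yields score y."""
    return (1.0 - y) ** (-a) - 1.0


def sigma_inv_2(x: float, y: float) -> float:
    """Exponent a with sigma(x, a) == y, for x > 0 and 0 < y < 1."""
    return -math.log(x + 1.0) / math.log(1.0 - y)


amount = 250.0
score = sigma(amount)
print(score, sigma_inv(score), sigma_inv_2(amount, score))
```

Whatever the real formula is, the same round-trip properties should hold: `sigma_inv` undoes `sigma` for a fixed exponent, and `_sigma_inv_2` recovers the exponent from an (amount, score) pair.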

_compute_delta() List[Tuple[str, float, int]][source]
Returns:

Pre-computed mapping delta for the current repository.

Return type:

List[Tuple[str, float, int]]

_score(delta: List[Tuple[str, float, int]]) float[source]
Returns:

The final score of the current repository.

Parameters:

delta (List[Tuple[str, float, int]]) –

Return type:

float

_detailed_results(delta: List[Tuple[str, float, int]]) Dict[str, List[Dict[str, Any]]][source]
Returns:

Check specific results with some details about the different kinds of documentation that were detected.

Parameters:

delta (List[Tuple[str, float, int]]) –

Return type:

Dict[str, List[Dict[str, Any]]]

__annotations__
__module__
run(args_dict: Dict[str, Any] | None = None) Dict[str, Any][source]
Parameters:

args_dict (Dict[str, Any] | None) –

Return type:

Dict[str, Any]

src.checks.interfaces_checked_in_binaries module

class src.checks.interfaces_checked_in_binaries.FileTypeInterface[source]

Bases: object

Represents a recognized file type

__init__()[source]
_key() Tuple[Hashable, ...][source]

Used to decide equality in comparisons and hash based lookups

Return type:

Tuple[Hashable, …]
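The `_key()` pattern can be sketched as follows: equality and hashing both delegate to a single tuple, so instances behave consistently in sets and dictionaries. The class `ExampleFileType` and its `mime` attribute are hypothetical and only serve to illustrate the pattern.

```python
from typing import Hashable, Tuple


class ExampleFileType:
    """Hypothetical file type that is identified by a MIME string."""

    def __init__(self, mime: str) -> None:
        self.mime = mime

    def _key(self) -> Tuple[Hashable, ...]:
        # Everything that defines identity goes into this tuple.
        return (self.mime,)

    def __hash__(self) -> int:
        return hash(self._key())

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ExampleFileType):
            return NotImplemented
        return self._key() == other._key()


blacklist = {ExampleFileType("application/x-executable")}
found = ExampleFileType("application/x-executable")
is_blacklisted = found in blacklist
print(is_blacklisted)
```

Because set membership relies on `__hash__` and `__eq__`, two distinct instances with the same key are treated as the same file type.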

__repr__() str[source]

Return repr(self).

Return type:

str

__str__() str[source]

Return str(self).

Return type:

str

__hash__() int[source]

Return hash(self).

Return type:

int

__eq__(other) bool[source]

Return self==value.

Return type:

bool

__annotations__
__dict__
__module__
__weakref__

list of weak references to the object (if defined)

class src.checks.interfaces_checked_in_binaries.FileTypeToolInterface[source]

Bases: object

Represents a tool that implements the mapping file -> file_type

__init__()[source]
class property name: str

__repr__() str[source]

Return repr(self).

Return type:

str

__str__() str[source]

Return str(self).

Return type:

str

file_type_of(file: Path) FileTypeInterface[source]
Parameters:

file (Path) –

Return type:

FileTypeInterface

__annotations__
__dict__
__module__
__weakref__

list of weak references to the object (if defined)

src.checks.interfaces_existence_of_documentation_infrastructure module

This module contains the interfaces and common functionality used by the Existence of Documentation Infrastructure check.

class src.checks.interfaces_existence_of_documentation_infrastructure.DocumentationTypeInterface(repo: Repo, api: Project)[source]

Bases: Named

Abstracts over the different kinds of documentation that a project might have. The business logic for finding and scoring documentation is in the implementing classes, this interface is used by the main check class to compute the final score.

The class also contains some helpers for common operations.

Parameters:
  • repo (Repo) –

  • api (Project) –

TEXT_FILE_REGEX: Pattern[str]

used to filter files that are likely not plain text

used to find markdown links to documentation

Bases: tuple

returned by methods that collect links to documentation

Parameters:
  • type (str) –

  • url (str) –

__annotations__
__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__match_args__
__module__
static __new__(_cls, type: str, url: str)

Create new instance of PubbliccodeymlDocLink(type, url)

Parameters:
  • type (str) –

  • url (str) –

__repr__()

Return a nicely formatted representation string

__slots__
_asdict()

Return a new dict which maps field names to their values.

_field_defaults
_fields
classmethod _make(iterable)

Make a new PubbliccodeymlDocLink object from a sequence or iterable

_replace(**kwds)

Return a new PubbliccodeymlDocLink object replacing specified fields with new values

type: str

Alias for field number 0

url: str

Alias for field number 1

Bases: tuple

Parameters:
  • file (str) –

  • preview (str) –

  • url (str) –

__annotations__
__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__match_args__
__module__
static __new__(_cls, file: str, preview: str, url: str)

Create new instance of ScrapedDocLink(file, preview, url)

Parameters:
  • file (str) –

  • preview (str) –

  • url (str) –

__repr__()

Return a nicely formatted representation string

__slots__
_asdict()

Return a new dict which maps field names to their values.

_field_defaults
_fields
classmethod _make(iterable)

Make a new ScrapedDocLink object from a sequence or iterable

_replace(**kwds)

Return a new ScrapedDocLink object replacing specified fields with new values

file: str

Alias for field number 0

preview: str

Alias for field number 1

url: str

Alias for field number 2

RM_WHITESPACE_MAP: Dict[int, Literal[None]]
__init__(repo: Repo, api: Project) None[source]
Parameters:
  • repo (Repo) –

  • api (Project) –

Return type:

None

_is_external_url(url: str | None) bool[source]

Checks if a link points to a target outside of OpenCoDE.

Parameters:

url (str | None) – url to decide

Returns:

True iff the url does not point to OpenCoDE

Return type:

bool

_docs_in_publiccodeyml(only_external: bool = False, only_internal: bool = False) List[PubbliccodeymlDocLink][source]

Checks if the publiccode.yml exists, and if it does, whether it contains links to documentation. Optionally returns only links that point back to the project itself, or only links that point to a URL outside of OpenCoDE.

Returns:

Tuples of (documentation type, link target) for all doc links that were found.

Parameters:
  • only_external (bool) –

  • only_internal (bool) –

Return type:

List[PubbliccodeymlDocLink]

Scans some kinds of text files in the repository for links that have something like *docs* in the preview text. Optionally returns only links that point back to the project itself, or only links that point to a URL outside of OpenCoDE.

Parameters:
  • only_external (bool) – return only links that point to a location outside of OpenCoDE

  • only_internal (bool) – return only links that point back to the project itself

Returns:

Tuples of (file name, link preview text, link target) for all doc links that were found.

Return type:

List[ScrapedDocLink]

_amount(files: Iterable[Path]) int[source]
Returns:

Total number of non-whitespace characters in files.

Parameters:

files (Iterable[Path]) –

Return type:

int

classmethod _text_file_filter(file_name: str) bool[source]
Parameters:

file_name (str) –

Return type:

bool

_get_publiccodeyml() Dict[str, Any] | None[source]

Try to find and parse the project’s publiccode.yml.

Returns:

a mapping that contains the parsed file

Return type:

Dict[str, Any] | None

_remove_whitespace(s: str) str[source]
Returns:

input string with all whitespace characters removed

Parameters:

s (str) –

Return type:

str
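A map of type Dict[int, None], like RM_WHITESPACE_MAP, has the shape that `str.translate` accepts for deleting characters. The sketch below shows how such a map can strip whitespace and how the result can be used to count non-whitespace characters. The map contents (ASCII whitespace only) and the function names are assumptions; the real map may cover more code points.

```python
import string
from typing import Dict

# Map every ASCII whitespace code point to None so str.translate drops it.
# Assumption: the real map may also cover Unicode whitespace.
RM_WHITESPACE_MAP_EXAMPLE: Dict[int, None] = {
    ord(c): None for c in string.whitespace
}


def remove_whitespace(s: str) -> str:
    """Return `s` with all (ASCII) whitespace characters removed."""
    return s.translate(RM_WHITESPACE_MAP_EXAMPLE)


def amount(text: str) -> int:
    """Number of non-whitespace characters in `text`."""
    return len(remove_whitespace(text))


print(remove_whitespace("a b\n\tc"), amount("  hello world \n"))
```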

delta() Tuple[float, int][source]

Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.

Returns:

confidence in the result, and amount of documentation detected

Return type:

Tuple[float, int]

__annotations__
__module__

src.checks.interfaces_sast_usage_basic module

src.checks.interfaces_secrets module

class src.checks.interfaces_secrets.SecretInterface[source]

Bases: object

Represents a single secret that was found by some tool

__init__()[source]
summarize() dict[source]
Return type:

dict

__dict__
__module__
__weakref__

list of weak references to the object (if defined)

class src.checks.interfaces_secrets.SecretsToolInterface[source]

Bases: object

Represents a tool that can be applied to a project in order to discover secrets

__init__()[source]
class property name

check_file(f: Path) None[source]

Checks the file ‘f’ for secrets and adds any found secrets to the internal state of the tool.

Parameters:

f (Path) –

Return type:

None

check_files(files: Iterable[Path]) None[source]

Checks the ‘files’ for secrets and adds any found secrets to the internal state of the tool.

Parameters:

files (Iterable[Path]) –

Return type:

None

create_or_overwrite_baseline(project_id: int)[source]

Creates a new, or overwrites an existing, baseline using the tool’s internal state

Parameters:

project_id (int) –

update_baseline(project_id: int) None[source]

Uses the tool’s internal state to update an existing or create a new baseline on disk. Essentially performs an intersection between the internal state and the baseline if there is one, else it just writes the internal state to disk.

Important: This method might change the tool’s internal state.

Parameters:

project_id (int) –

Return type:

None

diff_vs_baseline(project_id: int) Iterable[SecretInterface][source]

The list of all secrets the tool found in a project that are not in the baseline.

Parameters:

project_id (int) –

Return type:

Iterable[SecretInterface]
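The baseline semantics described for `update_baseline` and `diff_vs_baseline` can be sketched with plain set operations. In this sketch secrets are represented as strings and the baseline as an optional set; both are simplifications chosen for illustration, and treating a missing baseline as empty in `diff_vs_baseline` is an assumption.

```python
from typing import Optional, Set


def update_baseline(found: Set[str], baseline: Optional[Set[str]]) -> Set[str]:
    """New baseline: intersection with the old one, or `found` if none exists."""
    return set(found) if baseline is None else found & baseline


def diff_vs_baseline(found: Set[str], baseline: Optional[Set[str]]) -> Set[str]:
    """Findings that are not covered by the baseline."""
    return set(found) if baseline is None else found - baseline


found = {"token-a", "token-b"}
baseline = {"token-b", "token-c"}
print(update_baseline(found, baseline), diff_vs_baseline(found, baseline))
```

Intersecting on update means that a secret which has disappeared from the project also disappears from the baseline, so it would be reported again if it reappeared.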

delete_baseline(project_id: int) None[source]

Removes any persisted baselines this tool has for the given project. No-op if there is no baseline.

Parameters:

project_id (int) –

Return type:

None

property detected_secrets: Iterable[SecretInterface]

The list of all secrets the tool found in a project. Essentially the tool’s internal state.

__dict__
__module__
__weakref__

list of weak references to the object (if defined)

src.checks.sast_usage_basic module

Implementation of the SastUsageBasic check

class src.checks.sast_usage_basic.SastToolKind(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Enumerates the different classes of SAST tools that we differentiate between in our check. Each SastTool has one and it determines “how good” it is if we detect it in a project.

LINTER
SECURITY
SECRET
SCA
classmethod weight(kind: SastToolKind) float[source]

Encodes “how good” it is if we detect the tool in a project.

Parameters:

kind (SastToolKind) –

Return type:

float
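An enumeration with a class-level weight lookup could look like the sketch below. The member names mirror the ones listed above, but the class name `ToolKind` and all numeric weights are placeholders invented for this example; the real weights are not documented on this page.

```python
from enum import Enum, auto


class ToolKind(Enum):
    LINTER = auto()
    SECURITY = auto()
    SECRET = auto()
    SCA = auto()

    @classmethod
    def weight(cls, kind: "ToolKind") -> float:
        """Return a weight for `kind` (placeholder values only)."""
        weights = {
            cls.LINTER: 0.5,
            cls.SECURITY: 1.0,
            cls.SECRET: 1.0,
            cls.SCA: 1.0,
        }
        return weights[kind]


print(ToolKind.weight(ToolKind.LINTER))
```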

__module__
class src.checks.sast_usage_basic.SastTool(tool_json: Dict[str, Any])[source]

Bases: object

Represents a tool that we can hope to detect in a project.

Parameters:

tool_json (Dict[str, Any]) –

default_special_regex_values: Dict[str, Pattern[str]]
default_language_regex_values: Dict[str, str]
__init__(tool_json: Dict[str, Any])[source]
Parameters:

tool_json (Dict[str, Any]) – A map describing the tool; can be accessed by indexing the instance; domain is defined by the tool JSON schema

__add_source_file_regex(tool_json: Dict[str, Any]) Dict[str, Any]

Adds a source_file_regex key to the input map and populates the value with a regex that should match the names of source files of the languages in the input map’s languages array.

Returns:

The updated map

Parameters:

tool_json (Dict[str, Any]) –

Return type:

Dict[str, Any]

__compile_regex(tool_json: Dict[str, Any]) Dict[str, Any]

Replaces string values in the input dict that represent regular expressions with their compiled versions.

Returns:

The updated dict

Parameters:

tool_json (Dict[str, Any]) –

Return type:

Dict[str, Any]

__getitem__(index: str) Any[source]
Parameters:

index (str) –

Return type:

Any

classmethod from_file_validate(schema: Dict[str, Any], file: Path) SastTool[source]

Constructs an instance from a JSON file describing a tool. Validates the file against the expected schema before using it.

Parameters:
  • schema (Dict[str, Any]) –

  • file (Path) –

Return type:

SastTool

check_file(f: Path) bool[source]
Returns:

True iff the file ‘f’ indicates that the SAST tool is being used in the project

Parameters:

f (Path) –

Return type:

bool

_check_file(f: Path, name_regex: Pattern[str] | None, content_regex: Pattern[str] | None = None) bool[source]
Returns:

True iff a file’s name and contents match the respective regular expressions; if no content regex is supplied only the filename is checked

Parameters:
  • f (Path) –

  • name_regex (Pattern[str] | None) –

  • content_regex (Pattern[str] | None) –

Return type:

bool
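The matching rule described above (file name must match, and the contents too if a content regex is given) can be sketched as follows. This is an illustration under assumptions: the function name, the handling of a missing name regex (treated as "no match"), and the error handling for unreadable files are not taken from the source.

```python
import re
from pathlib import Path
from typing import Optional, Pattern


def check_file(
    f: Path,
    name_regex: Optional[Pattern[str]],
    content_regex: Optional[Pattern[str]] = None,
) -> bool:
    """True iff the file name, and optionally the contents, match."""
    # Assumption: without a name regex there is nothing to match against.
    if name_regex is None or name_regex.search(f.name) is None:
        return False
    if content_regex is None:
        return True
    try:
        text = f.read_text(encoding="utf-8", errors="replace")
    except OSError:
        # Unreadable files are treated as non-matching in this sketch.
        return False
    return content_regex.search(text) is not None
```

Checking the name first keeps the common case cheap, because file contents are only read for files whose names already match.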

property weight: float

Not all tools are equally good. Here we supply a rather arbitrary weight to influence how much effect the presence of the tool has on the final score. The higher the weight, the more I like the tool.

Returns:

The weight

__annotations__
__dict__
__module__
__weakref__

list of weak references to the object (if defined)

class src.checks.sast_usage_basic.SastUsageBasic(*args: Any, **kwargs: Dict[str, Any])[source]

Bases: CheckInterface

Parameters:
  • args (Any) –

  • kwargs (Dict[str, Any]) –

exclude: re.Pattern[str]
__init__(*args: Any, **kwargs: Dict[str, Any]) None[source]
Parameters:
  • args (Any) –

  • kwargs (Dict[str, Any]) –

Return type:

None

__load_tool_schema() Dict[str, Any]

Loads the JSON schema of the tool definitions from permanent storage.

config: tool_schema

Returns:

JSON schema of a single tool

Return type:

Dict[str, Any]

__generate_tools() None

Generates the individual JSON tool definitions from a CSV file that describes all of them.

config: tools_csv

effect: populates the directory config::tools_dir

note: no-op if directory is not empty

Return type:

None

__load_tools() List[SastTool]

Loads the JSON tool definitions from permanent storage.

config: tools_dir

Returns:

list of tools

Return type:

List[SastTool]

__build_lang_tools() Dict[str, List[SastTool]]

Constructs a mapping from programming languages to tools.

Returns:

The mapping

Return type:

Dict[str, List[SastTool]]

_detect_sast_tools() Dict[str, List[SastTool]][source]

Performs the actual “analysis”. Builds map that takes programming languages to the set of SAST tools that the project uses for this language.

Returns:

The mapping

Return type:

Dict[str, List[SastTool]]

_calc_score(detected_tools: Dict[str, List[SastTool]]) float[source]

Consumes the result of `_detect_sast_tools` and calculates the final score out of it.

Returns:

score

Parameters:

detected_tools (Dict[str, List[SastTool]]) –

Return type:

float

run(args_dict: Dict[str, Any] | None = None) Dict[str, Any][source]
Parameters:

args_dict (Dict[str, Any] | None) –

Return type:

Dict[str, Any]

__annotations__
__module__

src.checks.secrets module

Implementation of the Secrets check, which attempts to find leaked secrets in a git repository. It is really just running a bunch of open source tools and collecting their output.

class src.checks.secrets.DetectSecretsSecret(potential_secret: PotentialSecret)[source]

Bases: SecretInterface

Parameters:

potential_secret (PotentialSecret) –

__init__(potential_secret: PotentialSecret)[source]
Parameters:

potential_secret (PotentialSecret) –

summarize() dict[source]
Return type:

dict

__annotations__
__module__
class src.checks.secrets.DetectSecrets[source]

Bases: SecretsToolInterface

__init__() None[source]
Return type:

None

check_file(f: Path) None[source]

Checks the file ‘f’ for secrets and adds any found secrets to the internal state of the tool.

Parameters:

f (Path) –

Return type:

None

_get_baseline_dir(project_id: int) Path[source]
Parameters:

project_id (int) –

Return type:

Path

_get_baseline_file(project_id: int) Path[source]
Parameters:

project_id (int) –

Return type:

Path

maybe_load_baseline(project_id: int) SecretsCollection | None[source]
Parameters:

project_id (int) –

Return type:

SecretsCollection | None

create_or_overwrite_baseline(project_id: int) None[source]

Creates a new, or overwrites an existing, baseline using the tool’s internal state

Parameters:

project_id (int) –

Return type:

None

update_baseline(project_id: int) None[source]

Uses the tool’s internal state to update an existing or create a new baseline on disk. Essentially performs an intersection between the internal state and the baseline if there is one, else it just writes the internal state to disk.

Important: This method might change the tool’s internal state.

Parameters:

project_id (int) –

Return type:

None

diff_vs_baseline(project_id: int) Iterable[SecretInterface][source]

The list of all secrets the tool found in a project that are not in the baseline.

Parameters:

project_id (int) –

Return type:

Iterable[SecretInterface]

delete_baseline(project_id: int) None[source]

Removes any persisted baselines this tool has for the given project. No-op if there is no baseline.

Parameters:

project_id (int) –

Return type:

None

check_files(files: Iterable[Path]) None[source]

Checks the ‘files’ for secrets and adds any found secrets to the internal state of the tool.

Parameters:

files (Iterable[Path]) –

Return type:

None

property detected_secrets: Generator[SecretInterface, None, None]

The list of all secrets the tool found in a project. Essentially the tool’s internal state.

__annotations__
__module__
class src.checks.secrets.Secrets(proj: Project, repo: Repo, api: Gitlab)[source]

Bases: CheckInterface

Class which represents a check that runs a bunch of secret detection tools against a given project and spits out a ‘score’.

Parameters:
  • proj (Project) –

  • repo (Repo) –

  • api (Gitlab) –

exclude: Pattern
secretsTools: List[Type[SecretsToolInterface]]
_detect_secrets() Dict[str, Iterable[SecretInterface]][source]

Generates the set of results that are not in the baseline, i.e., ‘R \ (R ∩ B)’, for each tool. Returns the union ‘V’ of these sets. Also updates or creates baselines along the way.

Return type:

Dict[str, Iterable[SecretInterface]]

_calc_score(detected_secrets: Dict[str, Iterable[SecretInterface]]) float[source]
Parameters:

detected_secrets (Dict[str, Iterable[SecretInterface]]) –

Return type:

float

_process_args(args_dict: Dict[str, Any] | None) None[source]
Parameters:

args_dict (Dict[str, Any] | None) –

Return type:

None

run(args_dict: Dict[str, Any] | None = None) Dict[str, Any][source]
Parameters:

args_dict (Dict[str, Any] | None) –

Return type:

Dict[str, Any]

__annotations__
__module__
src.checks.secrets._custom_settings() Iterator[Settings][source]
Return type:

Iterator[Settings]

Module contents

Module that allows instantiating any selection of the available checks on a project

src.checks.results_valid(results: Dict[str, Any]) bool[source]

Validates the results that a check instance’s run method returned.

Parameters:

results (Dict[str, Any]) –

Return type:

bool

src.checks.validate_args(args_dict: Dict[str, Any]) bool[source]

Validates the set of arguments passed to the ‘check’ subcommand

Parameters:

args_dict (Dict[str, Any]) –

Return type:

bool

src.checks.transform_args(args_dict: Dict[str, Any]) Tuple[Path | None, int | None][source]

Transforms the set of arguments passed to the check subcommand

Parameters:

args_dict (Dict[str, Any]) –

Return type:

Tuple[Path | None, int | None]

src.checks.iter_checks(proj: Project, repo: Repo, api: Gitlab, filter_func: Callable[[Type[CheckInterface]], bool] = <function <lambda>>) Iterable[CheckInterface][source]

This method is used to get instances of all checks in the set filter_func(availableChecks) for a single project.

Yields:

check instances for the given project

Parameters:
  • proj (Project) –

  • repo (Repo) –

  • api (Gitlab) –

  • filter_func (Callable[[Type[CheckInterface]], bool]) –

Return type:

Iterable[CheckInterface]
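The filtering behaviour of `iter_checks` can be illustrated with a small generator. The stand-in classes `Check`, `CheckA` and `CheckB`, the list `AVAILABLE_CHECKS`, and the single `project` argument are hypothetical simplifications; the real function takes `proj`, `repo` and `api` and works on the package's own check classes.

```python
from typing import Callable, Iterable, List, Type


class Check:
    """Stand-in for a check interface; real checks take proj, repo and api."""

    def __init__(self, project: str) -> None:
        self.project = project


class CheckA(Check):
    pass


class CheckB(Check):
    pass


AVAILABLE_CHECKS: List[Type[Check]] = [CheckA, CheckB]


def iter_checks(
    project: str,
    filter_func: Callable[[Type[Check]], bool] = lambda _check: True,
) -> Iterable[Check]:
    """Yield one instance per check class accepted by `filter_func`."""
    for check_class in filter(filter_func, AVAILABLE_CHECKS):
        yield check_class(project)


selected = list(iter_checks("demo", lambda c: c.__name__ == "CheckB"))
print([type(c).__name__ for c in selected])
```

Since the filter receives classes rather than instances, a check that is filtered out is never constructed.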