src.checks package¶
Submodules¶
src.checks.checked_in_binaries module¶
Check that determines the file type for every file in the project and compares it to a blacklist of binary executable file formats
- class src.checks.checked_in_binaries.FactHelperFileFileType(ft: Dict[str, str])[source]¶
Bases:
FileTypeInterface
- Parameters:
ft (Dict[str, str]) –
- _key() Tuple[Hashable, ...] [source]¶
Used to decide equality in comparisons and hash based lookups
- Return type:
Tuple[Hashable, …]
- __module__¶
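The `_key()` pattern above can be sketched as follows. This is a minimal illustration, assuming that equality and hashing both delegate to the key tuple; `KeyedFileType` is a hypothetical stand-in and not the project's class.

```python
from typing import Dict, Hashable, Tuple


class KeyedFileType:
    """Hypothetical stand-in for a file type wrapper; not the project's class."""

    def __init__(self, ft: Dict[str, str]) -> None:
        self.ft = ft

    def _key(self) -> Tuple[Hashable, ...]:
        # Sort the items so that insertion order does not affect equality.
        return tuple(sorted(self.ft.items()))

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, KeyedFileType):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        # Hash the same tuple that __eq__ compares, so set and dict lookups agree.
        return hash(self._key())


a = KeyedFileType({"mime": "application/x-executable", "name": "ELF"})
b = KeyedFileType({"name": "ELF", "mime": "application/x-executable"})
```

With this in place, instances can be stored in a `Set[...]` such as a blacklist or whitelist and looked up by value rather than by identity.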
- class src.checks.checked_in_binaries.FactHelperFile[source]¶
Bases:
FileTypeToolInterface
- file_type_of(file: Path) FactHelperFileFileType [source]¶
- Parameters:
file (Path) –
- Return type:
FactHelperFileFileType
- __module__¶
- class src.checks.checked_in_binaries.CheckedInBinaries(*args, **kwargs)[source]¶
Bases:
CheckInterface
Represents a check that determines the file type for every file in the project and compares it to a blacklist of binary executable file formats
- blacklist_dir: Path¶
- exclude: Pattern¶
- blacklist: Set[FileTypeInterface]¶
- whitelist: Set[FileTypeInterface]¶
- fileTypeTools: List[Type[FileTypeToolInterface]]¶
- __update_blacklist() None ¶
- Return type:
None
- __init_blacklist() None ¶
- Return type:
None
- __is_too_generic(file_type: FileTypeInterface) bool ¶
- Parameters:
file_type (FileTypeInterface) –
- Return type:
bool
- _run_all_tools() None [source]¶
For each available tool or library the set of detected file types is determined. All files with illegal file types are recorded.
- Return type:
None
- _format_findings() Dict[str, Dict[str, List[str]]] [source]¶
- Return type:
Dict[str, Dict[str, List[str]]]
- __format_findings(findings: Dict[FileTypeInterface, List[Path]]) Dict[str, List[str]] ¶
- Parameters:
findings (Dict[FileTypeInterface, List[Path]]) –
- Return type:
Dict[str, List[str]]
- _is_ok(file_type: FileTypeInterface) bool [source]¶
- Parameters:
file_type (FileTypeInterface) –
- Return type:
bool
- run(args_dict: Dict[str, Any] | None = None) Dict[str, Any] [source]¶
- Parameters:
args_dict (Dict[str, Any] | None) –
- Return type:
Dict[str, Any]
- __annotations__¶
- __module__¶
src.checks.comments_in_code module¶
Implementation of the “Comments in Code” check
- class src.checks.comments_in_code.CommentsInCode(*args: Any, **kwargs: Dict[str, Any])[source]¶
Bases:
CheckInterface
Implementation of the “Comments in Code” check
This check essentially just runs tokei and divides comment lines by combined comment and code lines. There is some additional logic to handle programming languages.
- Parameters:
args (Any) –
kwargs (Dict[str, Any]) –
- __init__(*args: Any, **kwargs: Dict[str, Any]) None [source]¶
- Parameters:
args (Any) –
kwargs (Dict[str, Any]) –
- Return type:
None
- __load_tokei_to_linguist() Dict[str, str | None] ¶
- Return type:
Dict[str, str | None]
- __compute_l_check(tokei_to_linguist: Dict[str, str | None]) Set[str] ¶
Compute L_check via the relation L_check = Im(A) \ {None}, i.e. the image of A with None removed
- Parameters:
tokei_to_linguist (Dict[str, str | None]) –
- Return type:
Set[str]
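Reading the relation as "the image of the tokei-to-linguist map with `None` removed", the computation can be sketched as below. The function name and the sample mapping are illustrative assumptions; only the signature shape is taken from the entry above.

```python
from typing import Dict, Optional, Set


def compute_l_check(tokei_to_linguist: Dict[str, Optional[str]]) -> Set[str]:
    # Image of the mapping (its values), minus the None placeholder used for
    # tokei languages that have no linguist counterpart.
    return {lang for lang in tokei_to_linguist.values() if lang is not None}


# Made-up sample mapping, for illustration only.
mapping = {"Python": "Python", "Plain Text": None, "Cpp": "C++"}
l_check = compute_l_check(mapping)
```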
- __have_tokei() bool ¶
- Return type:
bool
- __fetch_linguist() Dict[str, float] ¶
- Return type:
Dict[str, float]
- __compute_l_repo(l_of_r: Set[str], l_check: Set[str]) Set[str] ¶
- Parameters:
l_of_r (Set[str]) –
l_check (Set[str]) –
- Return type:
Set[str]
- __run_tokei() Dict[str, Dict[str, int]] ¶
- Return type:
Dict[str, Dict[str, int]]
- _tokei(lang: str) float [source]¶
Map that takes a language in L_repo to its comments to code ratio
- Parameters:
lang (str) –
- Return type:
float
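The ratio described for this check (comment lines divided by combined comment and code lines) can be sketched as follows. The `"comments"` and `"code"` keys and the zero-line fallback are assumptions for illustration; the project's actual handling may differ.

```python
from typing import Dict


def comment_ratio(stats: Dict[str, int]) -> float:
    """Comment lines divided by comment plus code lines for one language."""
    comments = stats.get("comments", 0)
    code = stats.get("code", 0)
    total = comments + code
    # Assumption: a language without any counted lines gets a ratio of 0.0.
    return comments / total if total > 0 else 0.0


ratio = comment_ratio({"code": 75, "comments": 25, "blanks": 10})
```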
- _sigma(value: float) float [source]¶
Scoring function that receives the average comments to code ratio as an input and maps it to the final score.
Needed since we cannot expect a project to be 100% comments to receive a perfect score
- Parameters:
value (float) –
- Return type:
float
- _compute_tokei() Dict[str, float] [source]¶
- Returns:
The computed tokei map
- Return type:
Dict[str, float]
- _compute_score(lang_ratios: Dict[str, float]) float [source]¶
- Parameters:
lang_ratios (Dict[str, float]) –
- Return type:
float
- run(args_dict: Dict[str, Any] | None = None) Dict[str, Any] [source]¶
- Parameters:
args_dict (Dict[str, Any] | None) –
- Return type:
Dict[str, Any]
- __annotations__¶
- __module__¶
src.checks.existence_of_documentation_infrastructure module¶
This module contains the implementation of the Existence of Documentation Infrastructure check.
The check performs a bunch of heuristics to estimate the amount of documentation that exists for the piece of software developed in a repository.
- src.checks.existence_of_documentation_infrastructure.logger: Logger¶
Def: “Plain in-tree documentation” is defined as software documentation that is directly managed by the git vcs. In particular, no further steps are necessary to obtain the final form of the documentation after a checkout of the repository has been performed.
- class src.checks.existence_of_documentation_infrastructure.PlainInTreeFile(repo: Repo, api: Project)[source]¶
Bases:
DocumentationTypeInterface
Def: Plain in-tree documentation is said to be “file” if it is contained within plain text files that are placed in the same (sub)tree as non-documentation related files.
In practice this check does two things: 1. It has a simple whitelist of file names that are automatically considered to contain documentation when they are found in the repository. 2. It searches text files in the repository for links that point to files in the repository itself. It then uses those links to find the files locally.
Finally, the set of all these files (set is deduplicated) is used to compute the amount of documentation.
- Parameters:
repo (Repo) –
api (Project) –
- DOC_FILE_NAME_WHITELIST: Set[str]¶
Generated from a local dump of OpenCoDE (with some manual curation) using the following shell pipeline:
for f in $(fd -t f -i --regex '\.(md|txt|rst)$'); do
b=$(basename "$f" | rg -vi '(license|changelog|security|contrib|test|release|conduct)');
[ ! -z "$b" ] && echo "$b"; done | sort | uniq -c | sort -n | tail -n 100
- _url_to_file(url: str) Path | None [source]¶
Makes a best-effort attempt to convert a url (probably a link to a file in the remote GitLab repository) to a local file in our checkout. Note that the input is entirely untrusted, so path traversal issues must be avoided (even though we would only open the file and report its number of characters).
- Parameters:
url (str) – url that points to some file in the remote repo
- Returns:
local copy of the file
- Return type:
Path | None
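One way to guard against path traversal when mapping an untrusted URL to a checkout is sketched below. It assumes GitLab-style URLs of the form `.../-/blob/<ref>/<path>` with a ref that contains no slash; the function name, the `marker` parameter, and that URL handling are illustrative assumptions, not the project's implementation.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote, urlparse


def url_to_local_file(url: str, checkout: Path, marker: str = "/-/blob/") -> Optional[Path]:
    """Map a repository URL to a file inside `checkout`, or return None."""
    path = unquote(urlparse(url).path)
    if marker not in path:
        return None
    # Everything after the marker is assumed to be "<ref>/<path in repo>".
    _, _, rest = path.partition(marker)
    _, _, rel = rest.partition("/")
    if not rel:
        return None
    root = checkout.resolve()
    # Resolve ".." segments and symlinks before checking containment.
    candidate = (root / rel).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        # The resolved path escapes the checkout: reject it.
        return None
    return candidate if candidate.is_file() else None
```

The containment check runs on the resolved path, so encoded `..` segments and absolute paths are rejected along with plain `..` segments.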
- _find_doc_files_from_links() Set[Path] [source]¶
Checks for links to documentation that point back to the repository itself, both in the publiccode.yml and in text files. Then it tries to find the respective files locally.
- Returns:
Set of local documentation files that were found in that way.
- Return type:
Set[Path]
- static _doc_file_filter(file_name: str) bool [source]¶
Used to filter out all non-documentation files when iterating over a repository.
- Parameters:
file_name (str) – name of the file to decide
- Returns:
True iff file should be skipped
- Return type:
bool
- delta() Tuple[float, int] [source]¶
Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.
- Returns:
confidence in the result, and amount of documentation detected
- Return type:
Tuple[float, int]
- __annotations__¶
- __module__¶
- class src.checks.existence_of_documentation_infrastructure.PlainInTreeFolder(repo: Repo, api: Project)[source]¶
Bases:
DocumentationTypeInterface
Def: Plain in-tree documentation is said to be “folder” if there exists a subtree that is solely comprised of documentation.
In practice, we simply look for folders that are named something like *doc*. We then recursively count characters in this subtree (text files only).
- Parameters:
repo (Repo) –
api (Project) –
- DOC_FOLDER_RE¶
used to match directory names that contain documentation
- classmethod _doc_dir_predicate(dir_name: str) bool [source]¶
- Parameters:
dir_name (str) – name of the directory to decide
- Returns:
True iff the directory name indicates that the directory holds documentation.
- Return type:
bool
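A directory-name predicate of this kind can be sketched with a single compiled pattern. The pattern below is an assumption chosen to match "something like *doc*"; the real `DOC_FOLDER_RE` is not shown in this reference and may be stricter.

```python
import re

# Assumed pattern: any directory name containing "doc", case-insensitively.
DOC_FOLDER_RE = re.compile(r"doc", re.IGNORECASE)


def doc_dir_predicate(dir_name: str) -> bool:
    """True iff the directory name suggests that it holds documentation."""
    return DOC_FOLDER_RE.search(dir_name) is not None


candidates = [name for name in ["docs", "src", "Documentation", "tests"] if doc_dir_predicate(name)]
```

A pattern this loose also matches names such as `docker`, which is one reason a heuristic like this reports a confidence alongside the amount.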
- _find_doc_dirs() Iterable[Path] [source]¶
Returns all subtrees that are likely to hold only documentation.
- Return type:
Iterable[Path]
- _count_docs() int [source]¶
- Returns:
number of non-whitespace characters in some kinds of text files that live within directories that may contain documentation.
- Return type:
int
- delta() Tuple[float, int] [source]¶
Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.
- Returns:
confidence in the result, and amount of documentation detected
- Return type:
Tuple[float, int]
- __annotations__¶
- __module__¶
- class src.checks.existence_of_documentation_infrastructure.OutOfTreeExternal(repo: Repo, api: Project)[source]¶
Bases:
DocumentationTypeInterface
Def: “External out-of-tree documentation” is defined as any software documentation that can not be generated from the contents of the source code repository of the software and is not integrated into a management software that is wrapping the git repository. For example, this includes manually curated documentation that is hosted on an external website.
In practice, this check does two things: 1. Regex: It searches links with something like “docs” in their preview text within some kinds of text files. It then takes all links that are not pointing at the repository itself and marks them as external documentation (with low confidence). 2. It searches the ‘publiccode.yml’ and checks for keys that point at documentation. If they do not point at the project, it counts them as external documentation with high confidence. As we can not (and don’t want to) scrape the referenced websites in some way, the “amount” of documentation behind an external link is just a hard-coded value.
- Parameters:
repo (Repo) –
api (Project) –
- HARD_CODED_SCORE: float¶
If some docs are found, the amount returned by the delta method will always evaluate to this score.
- _get_amount() int [source]¶
Generates a value for the amount of documentation that was found. Since we do not want to scrape websites this is just some hard-coded value that leads to a score we like.
- Returns:
The amount that leads to the hard-coded score
- Return type:
int
- delta() Tuple[float, int] [source]¶
Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.
- Returns:
confidence in the result, and amount of documentation detected
- Return type:
Tuple[float, int]
- __annotations__¶
- __module__¶
- class src.checks.existence_of_documentation_infrastructure.OutOfTreeWiki(repo: Repo, api: Project)[source]¶
Bases:
DocumentationTypeInterface
Def: “Wiki out-of-tree documentation” is defined as any software documentation that is integrated into the management software that wraps the software’s git repository. In particular, this encompasses documentation that was generated from the source code repository and then made available via this method.
In the case of OpenCoDE, we check the Wiki pages of a project.
API: https://python-gitlab.readthedocs.io/en/stable/gl_objects/wikis.html
- Parameters:
repo (Repo) –
api (Project) –
- _fetch_wiki_pages() Dict[str, int] [source]¶
- Returns:
mapping of wiki page names to number of non-whitespace characters on the page
- Return type:
Dict[str, int]
- delta() Tuple[float, int] [source]¶
Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.
- Returns:
confidence in the result, and amount of documentation detected
- Return type:
Tuple[float, int]
- __annotations__¶
- __module__¶
- class src.checks.existence_of_documentation_infrastructure.ExistenceOfDocumentationInfrastructure(proj: Project, repo: Repo, api: Gitlab)[source]¶
Bases:
CheckInterface
Implementation of the Existence of Documentation Infrastructure check.
The class only contains the high-level logic of this check. It computes the mapping delta with the help of the specialized documentation type classes and then performs the score calculation based on that.
- Parameters:
proj (Project) –
repo (Repo) –
api (Gitlab) –
- DEFAULT_RISE: float¶
0.5.
- doc_types: List[Type[DocumentationTypeInterface]]¶
Specialized classes for detecting and counting the different kinds of documentation that we defined.
- static _sigma(x: int, a: float = 9.967226258835993) float [source]¶
- Parameters:
x (int) – amount [0,+infty)
a (float) –
- Returns:
score in [0,1)
- Return type:
float
- static sigma_inv(y: float, a: float = 9.967226258835993) float [source]¶
- Parameters:
y (float) – score in [0,1)
a (float) –
- Returns:
amount in [0, +infty)
- Return type:
float
- static _sigma_inv_2(x: int, y: float) float [source]¶
- Parameters:
x (int) – amount in [0, +infty)
y (float) – score in [0,1)
- Returns:
exponent a such that y = sigma(x, a)
- Return type:
float
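The formula behind `_sigma` is not given in this reference. The sketch below assumes one saturating form, sigma(x, a) = 1 - (1 + x)^(-1/a), chosen only because the default `a` is numerically close to log2(1001), which under this form would map an amount of 1000 to a score of 0.5. Treat the formula and all three function bodies as a guess that matches the documented domains and ranges.

```python
import math

# Default taken from the signatures above; numerically close to log2(1001).
A_DEFAULT = 9.967226258835993


def sigma(x: float, a: float = A_DEFAULT) -> float:
    """Assumed form: maps an amount in [0, +inf) to a score in [0, 1)."""
    return 1.0 - (1.0 + x) ** (-1.0 / a)


def sigma_inv(y: float, a: float = A_DEFAULT) -> float:
    """Inverse of the assumed form: maps a score in [0, 1) back to an amount."""
    return (1.0 - y) ** (-a) - 1.0


def sigma_inv_2(x: float, y: float) -> float:
    """Exponent a such that sigma(x, a) == y; requires x > 0 and 0 < y < 1."""
    return -math.log1p(x) / math.log(1.0 - y)
```

Under this assumed form, solving for the exponent that gives a target score at a chosen amount is how a default such as the one above could be derived.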
- _compute_delta() List[Tuple[str, float, int]] [source]¶
- Returns:
Pre-computed mapping delta for the current repository.
- Return type:
List[Tuple[str, float, int]]
- _score(delta: List[Tuple[str, float, int]]) float [source]¶
- Returns:
The final score of the current repository.
- Parameters:
delta (List[Tuple[str, float, int]]) –
- Return type:
float
- _detailed_results(delta: List[Tuple[str, float, int]]) Dict[str, List[Dict[str, Any]]] [source]¶
- Returns:
Check specific results with some details about the different kinds of documentation that were detected.
- Parameters:
delta (List[Tuple[str, float, int]]) –
- Return type:
Dict[str, List[Dict[str, Any]]]
- __annotations__¶
- __module__¶
src.checks.interfaces_checked_in_binaries module¶
- class src.checks.interfaces_checked_in_binaries.FileTypeInterface[source]¶
Bases:
object
Represents a recognized file type
- _key() Tuple[Hashable, ...] [source]¶
Used to decide equality in comparisons and hash based lookups
- Return type:
Tuple[Hashable, …]
- __annotations__¶
- __dict__¶
- __module__¶
- __weakref__¶
list of weak references to the object (if defined)
- class src.checks.interfaces_checked_in_binaries.FileTypeToolInterface[source]¶
Bases:
object
Represents a tool that implements the mapping file -> file_type
- class property name: str¶
- file_type_of(file: Path) FileTypeInterface [source]¶
- Parameters:
file (Path) –
- Return type:
FileTypeInterface
- __annotations__¶
- __dict__¶
- __module__¶
- __weakref__¶
list of weak references to the object (if defined)
src.checks.interfaces_existence_of_documentation_infrastructure module¶
This module contains the interfaces and common functionality used by the Existence of Documentation Infrastructure check.
- class src.checks.interfaces_existence_of_documentation_infrastructure.DocumentationTypeInterface(repo: Repo, api: Project)[source]¶
Bases:
Named
Abstracts over the different kinds of documentation that a project might have. The business logic for finding and scoring documentation is in the implementing classes, this interface is used by the main check class to compute the final score.
The class also contains some helpers for common operations.
- Parameters:
repo (Repo) –
api (Project) –
- TEXT_FILE_REGEX: Pattern[str]¶
used to filter files that are likely not plain text
- LINK_PATTERN: Pattern[str]¶
used to find markdown links to documentation
- class PubbliccodeymlDocLink(type, url)¶
Bases:
tuple
returned by methods that collect links to documentation
- Parameters:
type (str) –
url (str) –
- __annotations__¶
- __getnewargs__()¶
Return self as a plain tuple. Used by copy and pickle.
- __match_args__¶
- __module__¶
- static __new__(_cls, type: str, url: str)¶
Create new instance of PubbliccodeymlDocLink(type, url)
- Parameters:
type (str) –
url (str) –
- __repr__()¶
Return a nicely formatted representation string
- __slots__¶
- _asdict()¶
Return a new dict which maps field names to their values.
- _field_defaults¶
- _fields¶
- classmethod _make(iterable)¶
Make a new PubbliccodeymlDocLink object from a sequence or iterable
- _replace(**kwds)¶
Return a new PubbliccodeymlDocLink object replacing specified fields with new values
- type: str¶
Alias for field number 0
- url: str¶
Alias for field number 1
- class ScrapedDocLink(file, preview, url)¶
Bases:
tuple
- Parameters:
file (str) –
preview (str) –
url (str) –
- __annotations__¶
- __getnewargs__()¶
Return self as a plain tuple. Used by copy and pickle.
- __match_args__¶
- __module__¶
- static __new__(_cls, file: str, preview: str, url: str)¶
Create new instance of ScrapedDocLink(file, preview, url)
- Parameters:
file (str) –
preview (str) –
url (str) –
- __repr__()¶
Return a nicely formatted representation string
- __slots__¶
- _asdict()¶
Return a new dict which maps field names to their values.
- _field_defaults¶
- _fields¶
- classmethod _make(iterable)¶
Make a new ScrapedDocLink object from a sequence or iterable
- _replace(**kwds)¶
Return a new ScrapedDocLink object replacing specified fields with new values
- file: str¶
Alias for field number 0
- preview: str¶
Alias for field number 1
- url: str¶
Alias for field number 2
- RM_WHITESPACE_MAP: Dict[int, Literal[None]]¶
- __init__(repo: Repo, api: Project) None [source]¶
- Parameters:
repo (Repo) –
api (Project) –
- Return type:
None
- _is_external_url(url: str | None) bool [source]¶
Checks if a link points to a target outside of OpenCoDE.
- Parameters:
url (str | None) – url to decide
- Returns:
True iff the url does not point to OpenCoDE
- Return type:
bool
- _docs_in_publiccodeyml(only_external: bool = False, only_internal: bool = False) List[PubbliccodeymlDocLink] [source]¶
Checks if the publiccode.yml exists, and if it does, whether it contains links to documentation. Optionally returns only links that point back to the project itself, or only links that point to a URL outside of OpenCoDE.
- Returns:
Tuples of (documentation type, link target) for all doc links that were found.
- Parameters:
only_external (bool) –
only_internal (bool) –
- Return type:
List[PubbliccodeymlDocLink]
- _collect_doc_links(only_external: bool = False, only_internal: bool = False) List[ScrapedDocLink] [source]¶
Scans some kinds of text files in the repository for links that have something like *docs* in the preview text. Optionally returns only links that point back to the project itself, or only links that point to a URL outside of OpenCoDE.
- Parameters:
only_external (bool) – return only links that point to a location outside of OpenCoDE
only_internal (bool) – return only links that point back to the project itself
- Returns:
Tuples of (file name, link preview text, link target) for all doc links that were found.
- Return type:
List[ScrapedDocLink]
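Collecting markdown links whose preview text hints at documentation can be sketched as below. Both regular expressions and the `DocLink` tuple are assumptions that mirror the (file, preview, url) shape described above; the real `LINK_PATTERN` is not shown in this reference.

```python
import re
from typing import List, NamedTuple


class DocLink(NamedTuple):
    """Hypothetical tuple mirroring the (file, preview, url) shape."""
    file: str
    preview: str
    url: str


# Assumed patterns: inline markdown links, and a loose "doc" hint in the preview.
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
DOC_HINT = re.compile(r"doc", re.IGNORECASE)


def collect_doc_links(file_name: str, text: str) -> List[DocLink]:
    # Keep only links whose preview text mentions something like "docs".
    return [
        DocLink(file_name, preview, url)
        for preview, url in LINK_PATTERN.findall(text)
        if DOC_HINT.search(preview)
    ]


sample = "See the [API docs](https://example.org/api) and the [license](LICENSE)."
links = collect_doc_links("README.md", sample)
```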
- _amount(files: Iterable[Path]) int [source]¶
- Returns:
Total number of non-whitespace characters in files.
- Parameters:
files (Iterable[Path]) –
- Return type:
int
- classmethod _text_file_filter(file_name: str) bool [source]¶
- Parameters:
file_name (str) –
- Return type:
bool
- _get_publiccodeyml() Dict[str, Any] | None [source]¶
Try to find and parse the project’s publiccode.yml.
- Returns:
a mapping that contains the parsed file
- Return type:
Dict[str, Any] | None
- _remove_whitespace(s: str) str [source]¶
- Returns:
input string with all whitespace characters removed
- Parameters:
s (str) –
- Return type:
str
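A translation table whose values are all `None`, as the type of `RM_WHITESPACE_MAP` suggests, deletes characters when passed to `str.translate`. The sketch below assumes ASCII whitespace only; the project's table may cover more code points, and `amount` here takes strings rather than paths to stay self-contained.

```python
import string
from typing import Dict, Iterable

# Mapping a code point to None tells str.translate to drop that character.
RM_WHITESPACE_MAP: Dict[int, None] = {ord(c): None for c in string.whitespace}


def remove_whitespace(s: str) -> str:
    return s.translate(RM_WHITESPACE_MAP)


def amount(texts: Iterable[str]) -> int:
    """Total number of non-whitespace characters across all texts."""
    return sum(len(remove_whitespace(t)) for t in texts)


total = amount(["a b", " c\n"])
```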
- delta() Tuple[float, int] [source]¶
Restriction of the delta map to the documentation type represented by the implementor and the repository specified during the construction of this instance.
- Returns:
confidence in the result, and amount of documentation detected
- Return type:
Tuple[float, int]
- __annotations__¶
- __module__¶
src.checks.interfaces_sast_usage_basic module¶
src.checks.interfaces_secrets module¶
- class src.checks.interfaces_secrets.SecretInterface[source]¶
Bases:
object
Represents a single secret that was found by some tool
- __dict__¶
- __module__¶
- __weakref__¶
list of weak references to the object (if defined)
- class src.checks.interfaces_secrets.SecretsToolInterface[source]¶
Bases:
object
Represents a tool that can be applied to a project in order to discover secrets
- class property name¶
- check_file(f: Path) None [source]¶
Checks the file ‘f’ for secrets and adds any found secrets to the internal state of the tool.
- Parameters:
f (Path) –
- Return type:
None
- check_files(files: Iterable[Path]) None [source]¶
Checks the ‘files’ for secrets and adds any found secrets to the internal state of the tool.
- Parameters:
files (Iterable[Path]) –
- Return type:
None
- create_or_overwrite_baseline(project_id: int)[source]¶
Creates a new, or overwrites an existing, baseline using the tool’s internal state
- Parameters:
project_id (int) –
- update_baseline(project_id: int) None [source]¶
Uses the tool’s internal state to update an existing or create a new baseline on disk. Essentially performs an intersection between the internal state and the baseline if there is one, else it just writes the internal state to disk.
Important: This method might change the tool’s internal state.
- Parameters:
project_id (int) –
- Return type:
None
- diff_vs_baseline(project_id: int) Iterable[SecretInterface] [source]¶
The list of all secrets the tool found in a project that are not in the baseline.
- Parameters:
project_id (int) –
- Return type:
Iterable[SecretInterface]
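The baseline semantics described for `update_baseline` and `diff_vs_baseline` can be modelled with plain sets. `ToyBaselineTool` is a hypothetical in-memory model with string secrets; real tools persist their baselines on disk and use their own secret types.

```python
from typing import Dict, Set


class ToyBaselineTool:
    """Hypothetical in-memory model of the baseline operations."""

    def __init__(self) -> None:
        self.detected: Set[str] = set()
        self._baselines: Dict[int, Set[str]] = {}

    def update_baseline(self, project_id: int) -> None:
        old = self._baselines.get(project_id)
        if old is None:
            # No baseline yet: write the internal state as the new baseline.
            self._baselines[project_id] = set(self.detected)
        else:
            # Existing baseline: keep only entries that are still detected.
            self._baselines[project_id] = old & self.detected

    def diff_vs_baseline(self, project_id: int) -> Set[str]:
        # Findings that are not covered by the baseline.
        return self.detected - self._baselines.get(project_id, set())


tool = ToyBaselineTool()
tool.detected = {"key-a", "key-b"}
tool.update_baseline(1)
tool.detected = {"key-b", "key-c"}
new_findings = tool.diff_vs_baseline(1)
```

In this model the intersection means a baseline can only shrink over time, so a secret that is first detected after the baseline was created keeps being reported.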
- delete_baseline(project_id: int) None [source]¶
Removes any persisted baselines this tool has for the given project. No-op if there is no baseline.
- Parameters:
project_id (int) –
- Return type:
None
- property detected_secrets: Iterable[SecretInterface]¶
The list of all secrets the tool found in a project. Essentially the tool’s internal state.
- __dict__¶
- __module__¶
- __weakref__¶
list of weak references to the object (if defined)
src.checks.sast_usage_basic module¶
Implementation of the SastUsageBasic check
- class src.checks.sast_usage_basic.SastToolKind(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
Bases:
Enum
Enumerates the different classes of SAST tools that we differentiate between in our check. Each SastTool has one and it determines “how good” it is if we detect it in a project.
- LINTER¶
- SECURITY¶
- SECRET¶
- SCA¶
- classmethod weight(kind: SastToolKind) float [source]¶
Encodes “how good” it is if we detect the tool in a project.
- Parameters:
kind (SastToolKind) –
- Return type:
float
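An enum with a per-kind weight can be sketched as below. The member names follow the listing above, but the member values and all weights are invented for illustration and are not the project's numbers.

```python
from enum import Enum


class ToolKind(Enum):
    """Hypothetical mirror of the four kinds listed above."""
    LINTER = "linter"
    SECURITY = "security"
    SECRET = "secret"
    SCA = "sca"

    @classmethod
    def weight(cls, kind: "ToolKind") -> float:
        # Made-up weights: how much detecting a tool of this kind contributes.
        weights = {
            cls.LINTER: 0.5,
            cls.SECURITY: 1.0,
            cls.SECRET: 0.75,
            cls.SCA: 0.75,
        }
        return weights[kind]


security_weight = ToolKind.weight(ToolKind.SECURITY)
```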
- __module__¶
- class src.checks.sast_usage_basic.SastTool(tool_json: Dict[str, Any])[source]¶
Bases:
object
Represents a tool that we can hope to detect in a project.
- Parameters:
tool_json (Dict[str, Any]) –
- default_special_regex_values: Dict[str, Pattern[str]]¶
- default_language_regex_values: Dict[str, str]¶
- __init__(tool_json: Dict[str, Any])[source]¶
- Parameters:
tool_json (Dict[str, Any]) – A map describing the tool; can be accessed by indexing the instance; domain is defined by the tool JSON schema
- __add_source_file_regex(tool_json: Dict[str, Any]) Dict[str, Any] ¶
Adds a source_file_regex key to the input map and populates the value with a regex that should match the names of source files of the languages in the input map’s languages array.
- Returns:
The updated map
- Parameters:
tool_json (Dict[str, Any]) –
- Return type:
Dict[str, Any]
- __compile_regex(tool_json: Dict[str, Any]) Dict[str, Any] ¶
Replaces string values in the input dict that represent regular expressions with their compiled versions.
- Returns:
The updated dict
- Parameters:
tool_json (Dict[str, Any]) –
- Return type:
Dict[str, Any]
- classmethod from_file_validate(schema: Dict[str, Any], file: Path) SastTool [source]¶
Constructs an instance from a JSON file describing a tool. Validates the file against the expected schema before using it.
- Parameters:
schema (Dict[str, Any]) –
file (Path) –
- Return type:
SastTool
- check_file(f: Path) bool [source]¶
- Returns:
True iff the file ‘f’ indicates that the SAST tool is being used in the project
- Parameters:
f (Path) –
- Return type:
bool
- _check_file(f: Path, name_regex: Pattern[str] | None, content_regex: Pattern[str] | None = None) bool [source]¶
- Returns:
True iff a file’s name and contents match the respective regular expressions; if no content regex is supplied only the filename is checked
- Parameters:
f (Path) –
name_regex (Pattern[str] | None) –
content_regex (Pattern[str] | None) –
- Return type:
bool
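The name-then-content matching described for `_check_file` can be sketched as follows. Treating a missing name regex as "no match" and skipping unreadable files are assumptions of this sketch, as are the sample patterns in the usage note.

```python
from pathlib import Path
from typing import Optional, Pattern


def check_file(
    f: Path,
    name_regex: Optional[Pattern[str]],
    content_regex: Optional[Pattern[str]] = None,
) -> bool:
    # Assumption: without a name regex there is nothing to match against.
    if name_regex is None or name_regex.search(f.name) is None:
        return False
    if content_regex is None:
        # Only the file name is checked when no content regex is supplied.
        return True
    try:
        text = f.read_text(encoding="utf-8", errors="ignore")
    except OSError:
        return False
    return content_regex.search(text) is not None
```

For example, a name pattern such as `^\.pre-commit-config\.ya?ml$` combined with a content pattern such as `bandit` would flag a pre-commit configuration that references that tool.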
- property weight: float¶
Not all tools are equally good. Here we supply a rather arbitrary weight to influence how much effect the presence of the tool has on the final score. The higher the weight, the more I like the tool.
- Returns:
The weight
- __annotations__¶
- __dict__¶
- __module__¶
- __weakref__¶
list of weak references to the object (if defined)
- class src.checks.sast_usage_basic.SastUsageBasic(*args: Any, **kwargs: Dict[str, Any])[source]¶
Bases:
CheckInterface
- Parameters:
args (Any) –
kwargs (Dict[str, Any]) –
- exclude: re.Pattern[str]¶
- __init__(*args: Any, **kwargs: Dict[str, Any]) None [source]¶
- Parameters:
args (Any) –
kwargs (Dict[str, Any]) –
- Return type:
None
- __load_tool_schema() Dict[str, Any] ¶
Loads the JSON schema of the tool definitions from permanent storage.
config: tool_schema
- Returns:
JSON schema of a single tool
- Return type:
Dict[str, Any]
- __generate_tools() None ¶
Generates the individual JSON tool definitions from a CSV file that describes all of them.
config: tools_csv
effect: populates the directory config::tools_dir
note: no-op if directory is not empty
- Return type:
None
- __load_tools() List[SastTool] ¶
Loads the JSON tool definitions from permanent storage.
config: tools_dir
- Returns:
list of tools
- Return type:
List[SastTool]
- __build_lang_tools() Dict[str, List[SastTool]] ¶
Constructs a mapping from programming languages to tools.
- Returns:
The mapping
- Return type:
Dict[str, List[SastTool]]
- _detect_sast_tools() Dict[str, List[SastTool]] [source]¶
Performs the actual “analysis”. Builds map that takes programming languages to the set of SAST tools that the project uses for this language.
- Returns:
The mapping
- Return type:
Dict[str, List[SastTool]]
- _calc_score(detected_tools: Dict[str, List[SastTool]]) float [source]¶
Consumes the result of `_detect_sast_tools` and calculates the final score out of it.
- Returns:
score
- Parameters:
detected_tools (Dict[str, List[SastTool]]) –
- Return type:
float
- run(args_dict: Dict[str, Any] | None = None) Dict[str, Any] [source]¶
- Parameters:
args_dict (Dict[str, Any] | None) –
- Return type:
Dict[str, Any]
- __annotations__¶
- __module__¶
src.checks.secrets module¶
Implementation of the Secrets check, which attempts to find leaked secrets in a git repository. It is really just running a bunch of open source tools and collecting their output.
- class src.checks.secrets.DetectSecretsSecret(potential_secret: PotentialSecret)[source]¶
Bases:
SecretInterface
- Parameters:
potential_secret (PotentialSecret) –
- __init__(potential_secret: PotentialSecret)[source]¶
- Parameters:
potential_secret (PotentialSecret) –
- __annotations__¶
- __module__¶
- class src.checks.secrets.DetectSecrets[source]¶
Bases:
SecretsToolInterface
- check_file(f: Path) None [source]¶
Checks the file ‘f’ for secrets and adds any found secrets to the internal state of the tool.
- Parameters:
f (Path) –
- Return type:
None
- maybe_load_baseline(project_id: int) SecretsCollection | None [source]¶
- Parameters:
project_id (int) –
- Return type:
SecretsCollection | None
- create_or_overwrite_baseline(project_id: int) None [source]¶
Creates a new, or overwrites an existing, baseline using the tool’s internal state
- Parameters:
project_id (int) –
- Return type:
None
- update_baseline(project_id: int) None [source]¶
Uses the tool’s internal state to update an existing or create a new baseline on disk. Essentially performs an intersection between the internal state and the baseline if there is one, else it just writes the internal state to disk.
Important: This method might change the tool’s internal state.
- Parameters:
project_id (int) –
- Return type:
None
- diff_vs_baseline(project_id: int) Iterable[SecretInterface] [source]¶
The list of all secrets the tool found in a project that are not in the baseline.
- Parameters:
project_id (int) –
- Return type:
Iterable[SecretInterface]
- delete_baseline(project_id: int) None [source]¶
Removes any persisted baselines this tool has for the given project. No-op if there is no baseline.
- Parameters:
project_id (int) –
- Return type:
None
- check_files(files: Iterable[Path]) None [source]¶
Checks the ‘files’ for secrets and adds any found secrets to the internal state of the tool.
- Parameters:
files (Iterable[Path]) –
- Return type:
None
- property detected_secrets: Generator[SecretInterface, None, None]¶
The list of all secrets the tool found in a project. Essentially the tool’s internal state.
- __annotations__¶
- __module__¶
- class src.checks.secrets.Secrets(proj: Project, repo: Repo, api: Gitlab)[source]¶
Bases:
CheckInterface
Class which represents a check that runs a bunch of secret detection tools against a given project and spits out a ‘score’.
- Parameters:
proj (Project) –
repo (Repo) –
api (Gitlab) –
- exclude: Pattern¶
- secretsTools: List[Type[SecretsToolInterface]]¶
- _detect_secrets() Dict[str, Iterable[SecretInterface]] [source]¶
Generates the set of results that are not in the baseline, i.e., ‘R \ (R ∩ B)’, for each tool. Returns the union ‘V’ of these sets. Also updates or creates baselines along the way.
- Return type:
Dict[str, Iterable[SecretInterface]]
- _calc_score(detected_secrets: Dict[str, Iterable[SecretInterface]]) float [source]¶
- Parameters:
detected_secrets (Dict[str, Iterable[SecretInterface]]) –
- Return type:
float
- _process_args(args_dict: Dict[str, Any] | None) None [source]¶
- Parameters:
args_dict (Dict[str, Any] | None) –
- Return type:
None
- run(args_dict: Dict[str, Any] | None = None) Dict[str, Any] [source]¶
- Parameters:
args_dict (Dict[str, Any] | None) –
- Return type:
Dict[str, Any]
- __annotations__¶
- __module__¶
Module contents¶
Module that allows instantiating any selection of available checks on a project
- src.checks.results_valid(results: Dict[str, Any]) bool [source]¶
Validates the results that a check instance’s run method returned.
- Parameters:
results (Dict[str, Any]) –
- Return type:
bool
- src.checks.validate_args(args_dict: Dict[str, Any]) bool [source]¶
Validates the set of arguments passed to the ‘check’ subcommand
- Parameters:
args_dict (Dict[str, Any]) –
- Return type:
bool
- src.checks.transform_args(args_dict: Dict[str, Any]) Tuple[Path | None, int | None] [source]¶
Transforms the set of arguments passed to the check subcommand
- Parameters:
args_dict (Dict[str, Any]) –
- Return type:
Tuple[Path | None, int | None]
- src.checks.iter_checks(proj: ~gitlab.v4.objects.projects.Project, repo: ~git.repo.base.Repo, api: ~gitlab.client.Gitlab, filter_func: ~typing.Callable[[~typing.Type[~src.interfaces.CheckInterface]], bool] = <function <lambda>>) Iterable[CheckInterface] [source]¶
This method is used to get instances of all checks in the set filter_func(availableChecks) for a single project.
- Yields:
check instances for the given project
- Parameters:
proj (Project) –
repo (Repo) –
api (Gitlab) –
filter_func (Callable[[Type[CheckInterface]], bool]) –
- Return type:
Iterable[CheckInterface]
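The filter-then-instantiate behaviour of `iter_checks` can be sketched with a generator. The check classes, the `AVAILABLE_CHECKS` list, and the single `project` argument are hypothetical simplifications; the real function takes a project, a repository, and an API client.

```python
from typing import Callable, Iterator, List, Type


class BaseCheck:
    """Hypothetical stand-in for the check interface."""

    def __init__(self, project: str) -> None:
        self.project = project


class SecretsCheck(BaseCheck):
    pass


class CommentsCheck(BaseCheck):
    pass


# Hypothetical registry of every check class that could be run.
AVAILABLE_CHECKS: List[Type[BaseCheck]] = [SecretsCheck, CommentsCheck]


def iter_checks(
    project: str,
    filter_func: Callable[[Type[BaseCheck]], bool] = lambda _cls: True,
) -> Iterator[BaseCheck]:
    # Filter on the classes first, then instantiate only the selected ones.
    for check_cls in filter(filter_func, AVAILABLE_CHECKS):
        yield check_cls(project)


selected = [type(c).__name__ for c in iter_checks("demo", lambda cls: cls is not SecretsCheck)]
```

Because the filter receives classes rather than instances, a check that is filtered out is never constructed.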