Module: extractor.py

Purpose:

This module provides abstract syntax tree node visitation and attribute extraction functionality.

Platform:

Linux/Windows | Python 3.10+

Developer:

J Berendt

Email:

development@s3dev.uk

Comments:

n/a

Example:

Example code use:

>>> from badsnakes.libs.parser import Parser
>>> from badsnakes.libs.extractor import Extractor

>>> p = Parser()
>>> e = Extractor()
>>> p.parse(path='hello.py')
>>> e.extract(node=p.ast_)

>>> # Display the extracted nodes.
>>> e.display()

class badsnakes.libs.extractor.Extractor

Bases: NodeVisitor

Inspect, extract and store relevant AST node attributes.

display(name: str = None)

Display the extracted contents.

The extracted attributes for each of the following AST nodes are displayed here:

  • ast.Assign

  • ast.Attribute

  • ast.Call

  • ast.Constant

  • ast.FunctionDef

  • ast.Import

  • ast.ImportFrom

Parameters:

name (str, optional) – Name of the Python module being displayed. Defaults to None.

extract(node: Module)

Extract and store relevant attributes from a parsed AST.

This method is a thin wrapper around ast.NodeVisitor.visit(), which it calls directly once the docstrings have been extracted.

Parameters:

node (ast.Module) – Starting node to be visited from which attributes are to be extracted.

visit_Assign(node: Assign)

Extract attributes of interest from ast.Assign nodes.

Generally, the assignments are used by the analyser to detect (very) long strings, or suspicious module or function aliasing.

For example:

  • [A very very long string which may be base64 encoded code]

  • A URL including ‘http’

  • cexe = exec

  • lave = eval

  • _i = __import__

Parameters:

node (ast.Assign) – A node of type ast.Assign.
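As an illustration of the aliasing case above, the following is a minimal sketch, not the badsnakes implementation. The class name AssignSketch, the WATCHED set and the aliases and strings attributes are all hypothetical; the real extractor may store different attributes.

```python
import ast

# Hypothetical watch list; the real analyser's configuration may differ.
WATCHED = {'exec', 'eval', '__import__'}


class AssignSketch(ast.NodeVisitor):
    """Illustrative only; not the badsnakes Extractor."""

    def __init__(self):
        self.aliases = []   # (alias, original) pairs, e.g. ('cexe', 'exec').
        self.strings = []   # String values assigned to simple names.

    def visit_Assign(self, node: ast.Assign):
        for target in node.targets:
            # Only simple names are considered in this sketch.
            if not isinstance(target, ast.Name):
                continue
            value = node.value
            if isinstance(value, ast.Name) and value.id in WATCHED:
                self.aliases.append((target.id, value.id))
            elif isinstance(value, ast.Constant) and isinstance(value.value, str):
                self.strings.append(value.value)
        # Continue into child nodes.
        self.generic_visit(node)


source = "cexe = exec\nlave = eval\n_i = __import__\nurl = 'http://example.com'\n"
sketch = AssignSketch()
sketch.visit(ast.parse(source))
```

After the visit, sketch.aliases holds the three alias pairs and sketch.strings holds the URL, which a later analysis step could check for length or for 'http'.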

visit_Attribute(node: Attribute)

Extract attributes of interest from ast.Attribute nodes.

For example:

  • __builtins__.__getattribute__

  • ctypes.windll

  • os.system

Parameters:

node (ast.Attribute) – A node of type ast.Attribute.
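One way to recover dotted names such as those listed above is ast.unparse(), available from Python 3.9. This is a sketch under that assumption; AttributeSketch and its attributes list are hypothetical names, and the real extractor may record the value and attribute parts separately.

```python
import ast


class AttributeSketch(ast.NodeVisitor):
    """Illustrative only; not the badsnakes Extractor."""

    def __init__(self):
        self.attributes = []   # Dotted names, e.g. 'os.system'.

    def visit_Attribute(self, node: ast.Attribute):
        # ast.unparse rebuilds the source text for the attribute access.
        self.attributes.append(ast.unparse(node))
        self.generic_visit(node)


source = "ctypes.windll\nos.system('ls')\n__builtins__.__getattribute__\n"
sketch = AttributeSketch()
sketch.visit(ast.parse(source))
```

Note that a chained access such as a.b.c would be recorded twice in this sketch (as 'a.b.c' and as 'a.b'), because the inner attribute node is visited as well.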

visit_Call(node: Call)

Extract attributes of interest from ast.Call nodes.

The analyser uses function calls to detect calls to functions which are generally considered unsafe, or which are used for suspicious activity.

Additionally, any arguments into these function calls are stored into the _args class attribute, to be later added to the Module.arguments object.

For example:

  • Calls to compile, exec or eval

  • Disguised imports using __import__

  • Calls to requests.post

Parameters:

node (ast.Call) – A node of type ast.Call.
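The call-name and argument collection described above might look like the following sketch. CallSketch, calls and args are hypothetical names (the source only states that arguments are stored in _args), and only positional constant arguments are kept here for brevity.

```python
import ast


class CallSketch(ast.NodeVisitor):
    """Illustrative only; not the badsnakes Extractor."""

    def __init__(self):
        self.calls = []   # Called names, e.g. 'exec' or 'requests.post'.
        self.args = []    # Positional constant arguments passed to the calls.

    def visit_Call(self, node: ast.Call):
        # node.func is a Name for exec(...) and an Attribute for requests.post(...).
        self.calls.append(ast.unparse(node.func))
        for arg in node.args:
            if isinstance(arg, ast.Constant):
                self.args.append(arg.value)
        # Continue into child nodes so that nested calls are seen as well.
        self.generic_visit(node)


source = (
    "exec('print(1)')\n"
    "__import__('os')\n"
    "requests.post('http://example.com', data=payload)\n"
)
sketch = CallSketch()
sketch.visit(ast.parse(source))
```

Keyword arguments live in node.keywords rather than node.args, so data=payload is not collected by this sketch.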

visit_Constant(node: Constant)

Extract attributes of interest from ast.Constant nodes.

Generally, the constants of interest here are strings. The extracted strings will be compared against the blacklisted strings to determine if any suspicious activities are being attempted.

Docstrings:

Often, a docstring containing benign text such as a semicolon or the term ‘execute’ can flag a module as dangerous during a string search.

Because of this, the AST is walked to collect and store all docstrings when the extract() method is called. A constant node is only stored by this method for analysis if the constant’s value was not found in the stored docstrings. For further rationale on this, please refer to the _extract_docstrings() method.

For example:

  • Calls to cmd.exe or powershell

  • References to Bitcoin or other payment demands

  • Windows registry key paths

Parameters:

node (ast.Constant) – A node of type ast.Constant.
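The docstring exclusion described above can be sketched as follows. ConstantSketch, docs and constants are hypothetical names, and the docstring set is built by hand for a single function here, whereas the real extractor collects it for the whole module beforehand.

```python
import ast


class ConstantSketch(ast.NodeVisitor):
    """Illustrative only; not the badsnakes Extractor."""

    def __init__(self, docs):
        self.docs = set(docs)   # Docstrings collected before the visit.
        self.constants = []     # String constants kept for analysis.

    def visit_Constant(self, node: ast.Constant):
        # Keep string constants only, and skip any which are docstrings.
        if isinstance(node.value, str) and node.value not in self.docs:
            self.constants.append(node.value)


source = (
    'def run():\n'
    '    "Run; then execute."\n'
    '    return "cmd.exe /c whoami"\n'
)
tree = ast.parse(source)
# The uncleaned docstring matches the value of its Constant node exactly.
docs = [ast.get_docstring(tree.body[0], clean=False)]
sketch = ConstantSketch(docs)
sketch.visit(tree)
```

The docstring contains both a semicolon and the term 'execute', yet only the command string reaches sketch.constants.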

visit_FunctionDef(node: FunctionDef)

Extract attributes of interest from ast.FunctionDef nodes.

Generally, the analyser searches these nodes for obfuscated function names, which may indicate suspicious activity.

For example:

  • _

  • __

  • _0xb1

  • _00OO00OO

  • _01001001

Parameters:

node (ast.FunctionDef) – A node of type ast.FunctionDef.
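The sketch below collects function names and then flags those resembling the examples above. The OBFUSCATED pattern is a hypothetical stand-in for whatever rules the analyser actually applies, and FunctionDefSketch and names are assumed names.

```python
import ast
import re

# Hypothetical pattern: underscores alone, or underscores followed by a hex
# literal, a run of 0/O characters, or a run of binary digits.
OBFUSCATED = re.compile(r'_+(0x[0-9a-fA-F]+|[0O]+|[01]+)?')


class FunctionDefSketch(ast.NodeVisitor):
    """Illustrative only; not the badsnakes Extractor."""

    def __init__(self):
        self.names = []   # Function names in definition order.

    def visit_FunctionDef(self, node: ast.FunctionDef):
        self.names.append(node.name)
        # Continue into the body so that nested functions are collected.
        self.generic_visit(node)


source = (
    "def _(): pass\n"
    "def __(): pass\n"
    "def _0xb1(): pass\n"
    "def _00OO00OO(): pass\n"
    "def _01001001(): pass\n"
    "def main(): pass\n"
)
sketch = FunctionDefSketch()
sketch.visit(ast.parse(source))
flagged = [name for name in sketch.names if OBFUSCATED.fullmatch(name)]
```

Using fullmatch() keeps ordinary names such as main or __init__ out of the flagged list.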

visit_Import(node: Import)

Extract attributes of interest from ast.Import nodes.

Generally, the analyser will use these nodes in search of module imports which may indicate suspicious activity.

For example:

  • import requests

  • import winreg

  • import ctypes as ct

  • import win32api as _win32api

  • import win32con as _win32con

Parameters:

node (ast.Import) – A node of type ast.Import.
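A minimal sketch of collecting module names and their aliases from ast.Import nodes follows. ImportSketch and imports are hypothetical names. Parsing does not import anything, so the listed modules need not be installed.

```python
import ast


class ImportSketch(ast.NodeVisitor):
    """Illustrative only; not the badsnakes Extractor."""

    def __init__(self):
        self.imports = []   # (module, alias) pairs; alias is None if unused.

    def visit_Import(self, node: ast.Import):
        # One Import node may carry several names: import a, b as c
        for alias in node.names:
            self.imports.append((alias.name, alias.asname))


source = "import requests\nimport ctypes as ct\nimport win32api as _win32api\n"
sketch = ImportSketch()
sketch.visit(ast.parse(source))
```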

visit_ImportFrom(node: ImportFrom)

Extract attributes of interest from ast.ImportFrom nodes.

Generally, the analyser will use these nodes in search of module imports which may indicate suspicious activity.

For example:

  • from win32api import SetFileAttributes

  • from win32con import SRCAND, FILE_ATTRIBUTE_HIDDEN

  • from win32file import CreateFileW, WriteFile, CloseHandle

Parameters:

node (ast.ImportFrom) – A node of type ast.ImportFrom.
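For ast.ImportFrom nodes, the source module is held in node.module and the imported names in node.names. The sketch below records both; ImportFromSketch and imported are hypothetical names, not the attributes used by the real extractor.

```python
import ast


class ImportFromSketch(ast.NodeVisitor):
    """Illustrative only; not the badsnakes Extractor."""

    def __init__(self):
        self.imported = []   # (module, name) pairs.

    def visit_ImportFrom(self, node: ast.ImportFrom):
        # node.module is None for relative imports such as: from . import x
        for alias in node.names:
            self.imported.append((node.module, alias.name))


source = (
    "from win32con import SRCAND, FILE_ATTRIBUTE_HIDDEN\n"
    "from win32api import SetFileAttributes\n"
)
sketch = ImportFromSketch()
sketch.visit(ast.parse(source))
```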

_extract_docstrings(node: Module)

Collect and store all docstrings in the module.

Parameters:

node (ast.Module) – Top-level AST node to be searched.

The extracted (uncleaned) docstrings are stored into the _docs attribute. A constant is only tested if its value is not in the _docs attribute.

Rationale:

Extracting and storing docstrings lets us put simple strings such as ';' and '()' in config.toml under the [analyser.constant.dangerous] and [analyser.constant.suspect] tables without triggering a false positive whenever the string appears somewhere in a docstring.
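A docstring collection of this kind could be sketched with ast.walk() and ast.get_docstring(), as below. The function name collect_docstrings and the DOC_NODES tuple are hypothetical; the real method stores its results in _docs and may be implemented differently.

```python
import ast

# Node types for which ast.get_docstring() is defined.
DOC_NODES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def collect_docstrings(tree: ast.Module) -> list:
    """Return the uncleaned docstrings found anywhere in the tree."""
    docs = []
    for node in ast.walk(tree):
        if isinstance(node, DOC_NODES):
            # clean=False keeps the raw text, so it equals Constant.value.
            doc = ast.get_docstring(node, clean=False)
            if doc is not None:
                docs.append(doc)
    return docs


source = (
    '"""Module; docs."""\n'
    '\n'
    'class A:\n'
    '    """Class docs ()."""\n'
    '\n'
    '    def f(self):\n'
    '        """Execute f."""\n'
    '        return ";"\n'
)
docs = collect_docstrings(ast.parse(source))
```

Here the returned ';' is not a docstring, so it would still be tested, while the ';' and '()' inside the docstrings would be ignored.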

generic_visit(node)

Called if no explicit visitor function exists for a node.

visit(node)

Visit a node.
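The two inherited methods above drive the dispatch: visit() looks for a method named visit_ followed by the node's class name, and falls back to generic_visit() when none exists. The sketch below shows this with the standard library only; DispatchSketch, explicit and fallback are hypothetical names.

```python
import ast


class DispatchSketch(ast.NodeVisitor):
    """Illustrative only; records which path handled each node."""

    def __init__(self):
        self.explicit = []   # Nodes handled by an explicit visit_* method.
        self.fallback = []   # Nodes handled by generic_visit().

    def visit_Import(self, node: ast.Import):
        # Children are not visited, as generic_visit() is not called here.
        self.explicit.append(type(node).__name__)

    def generic_visit(self, node):
        self.fallback.append(type(node).__name__)
        super().generic_visit(node)


sketch = DispatchSketch()
sketch.visit(ast.parse("import os\nx = 1\n"))
```

Because visit_Import() does not call generic_visit(), the alias child of the import is never reached, which is why the visit_* sketches that need child nodes call self.generic_visit(node) explicitly.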