Module: extractor.py
- Purpose:
This module provides abstract syntax tree node visitation and attribute extraction functionality.
- Platform:
Linux/Windows | Python 3.10+
- Developer:
J Berendt
- Email:
- Comments:
n/a
- Example:
Example code use:
>>> from badsnakes.libs.parser import Parser
>>> from badsnakes.libs.extractor import Extractor
>>> p = Parser()
>>> e = Extractor()
>>> p.parse(path='hello.py')
>>> e.extract(node=p.ast_)
>>> # Display the extracted nodes.
>>> e.display()
- class badsnakes.libs.extractor.Extractor[source]
Bases:
NodeVisitor
Inspect, extract and store relevant AST node attributes.
- display(name: str = None)[source]
Display the extracted contents.
The extracted attributes for each of the following AST nodes are displayed here:
ast.Assign
ast.Attribute
ast.Call
ast.Constant
ast.FunctionDef
ast.Import
ast.ImportFrom
- Parameters:
name (str, optional) – Name of the Python module being displayed. Defaults to None.
- extract(node: Module)[source]
Extract and store relevant attributes from a parsed AST.
This method is an alias for ast.NodeVisitor.visit(), which is called directly after the docstrings have been extracted.
- Parameters:
node (ast.Module) – Starting node to be visited from which attributes are to be extracted.
- visit_Assign(node: Assign)[source]
Extract attributes of interest from ast.Assign nodes.
Generally, the assignments are used by the analyser to detect (very) long strings, or suspicious module or function aliasing.
For example:
[A very very long string which may be base64 encoded code]
A URL including ‘http’
cexe = exec
lave = eval
_i = __import__
- Parameters:
node (ast.Assign) – A node of type ast.Assign.
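The aliasing case above can be illustrated with a minimal sketch. This is not the badsnakes implementation: the AssignSketch class and the RISKY set are hypothetical names, and only simple name-to-value assignments are recorded.

```python
import ast


class AssignSketch(ast.NodeVisitor):
    """Hypothetical visitor: records (target, value source) pairs."""

    def __init__(self):
        self.assigns = []

    def visit_Assign(self, node):
        for target in node.targets:
            if isinstance(target, ast.Name):
                # ast.unparse() turns the value expression back into source text.
                self.assigns.append((target.id, ast.unparse(node.value)))
        self.generic_visit(node)


source = "cexe = exec\nlave = eval\n_i = __import__\nx = 1\n"
visitor = AssignSketch()
visitor.visit(ast.parse(source))

# Hypothetical check: names bound directly to a risky builtin.
RISKY = {"exec", "eval", "__import__"}
aliases = [name for name, value in visitor.assigns if value in RISKY]
print(aliases)
```

Keeping the value as source text makes a later comparison against a configured list straightforward; an analyser could equally keep the raw node.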
- visit_Attribute(node: Attribute)[source]
Extract attributes of interest from ast.Attribute nodes.
For example:
__builtins__.__getattribute__
ctypes.windll
os.system
- Parameters:
node (ast.Attribute) – A node of type ast.Attribute.
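One way to recover dotted names such as os.system from nested ast.Attribute nodes is sketched below. The dotted_name helper and AttributeSketch class are hypothetical and are not taken from badsnakes.

```python
import ast


def dotted_name(node):
    """Return 'a.b.c' for a chain of Name/Attribute nodes, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


class AttributeSketch(ast.NodeVisitor):
    """Hypothetical visitor: stores the full dotted name of each chain."""

    def __init__(self):
        self.attributes = []

    def visit_Attribute(self, node):
        name = dotted_name(node)
        if name is not None:
            self.attributes.append(name)
        # generic_visit() is deliberately skipped so inner links of the chain
        # are not recorded again; chains rooted in a call are not descended into.


visitor = AttributeSketch()
visitor.visit(ast.parse("import os\nos.system('ls')\nctypes.windll.kernel32\n"))
print(visitor.attributes)
```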
- visit_Call(node: Call)[source]
Extract attributes of interest from ast.Call nodes.
Generally, function calls are used by the analyser to detect calls to functions which are widely considered unsafe, or which are used for suspicious activity.
Additionally, any arguments passed to these function calls are stored in the _args class attribute, to be later added to the Module.arguments object.
For example:
Calls to compile, exec or eval
Disguised imports using __import__
Calls to requests.post
- Parameters:
node (ast.Call) – A node of type ast.Call.
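As a rough illustration of collecting call names together with their arguments, consider the sketch below. The CallSketch class and its calls attribute are hypothetical; how badsnakes populates _args internally is not shown here, and only constant positional arguments are kept.

```python
import ast


class CallSketch(ast.NodeVisitor):
    """Hypothetical visitor: stores (function name, constant positional args)."""

    def __init__(self):
        self.calls = []

    def visit_Call(self, node):
        name = ast.unparse(node.func)
        args = [a.value for a in node.args if isinstance(a, ast.Constant)]
        self.calls.append((name, args))
        # Keep descending so nested calls are captured as well.
        self.generic_visit(node)


source = (
    "__import__('os')\n"
    "requests.post('http://example.invalid', data=payload)\n"
    "eval(compile(code, '<s>', 'exec'))\n"
)
visitor = CallSketch()
visitor.visit(ast.parse(source))
for name, args in visitor.calls:
    print(name, args)
```

Because the source is only parsed and never executed, undefined names such as payload and code are harmless in this example.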
- visit_Constant(node: Constant)[source]
Extract attributes of interest from ast.Constant nodes.
Generally, the constants of interest here are strings. The extracted strings will be compared against the blacklisted strings to determine if any suspicious activities are being attempted.
- Docstrings:
Often, a docstring containing benign text such as a semicolon or the term ‘execute’ can flag a module as dangerous during a string search.
Because of this, the AST is walked to collect and store all docstrings when the extract() method is called. A constant node is only stored by this method for analysis if the constant’s value was not found in the stored docstrings. For further rationale on this, please refer to the _extract_docstrings() method.
For example:
Calls to cmd.exe or powershell
References to Bitcoin or other payment demands
Windows registry key paths
- Parameters:
node (ast.Constant) – A node of type ast.Constant.
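The docstring filter described above might look roughly like this. The ConstantSketch class is hypothetical, and the set of known docstrings is passed in by hand here rather than collected from the tree.

```python
import ast


class ConstantSketch(ast.NodeVisitor):
    """Hypothetical visitor: keeps string constants that are not docstrings."""

    def __init__(self, docstrings):
        self.docstrings = set(docstrings)
        self.strings = []

    def visit_Constant(self, node):
        # Only strings are of interest; known docstring values are skipped.
        if isinstance(node.value, str) and node.value not in self.docstrings:
            self.strings.append(node.value)


source = '''
def run():
    """Execute the job; then exit."""
    return "cmd.exe /c whoami"
'''
visitor = ConstantSketch(docstrings={"Execute the job; then exit."})
visitor.visit(ast.parse(source))
print(visitor.strings)
```

Comparing by value means a non-docstring constant that happens to equal a docstring is skipped too, which appears consistent with the rule stated above.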
- visit_FunctionDef(node: FunctionDef)[source]
Extract attributes of interest from ast.FunctionDef nodes.
Generally, the analyser will use these nodes in search of obfuscated function names, indicating suspicious activity.
For example:
_
__
_0xb1
_00OO00OO
_01001001
- Parameters:
node (ast.FunctionDef) – A node of type ast.FunctionDef.
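To make the idea concrete, the sketch below collects function names and tests them against a pattern. The OBFUSCATED regular expression is an assumption written only to match the example names listed above; it is not the rule badsnakes applies, and FunctionSketch is a hypothetical class.

```python
import ast
import re

# Hypothetical pattern: underscores only, or an underscore followed by
# hex-like, O/0-only, or binary-looking characters.
OBFUSCATED = re.compile(r"^(_+|_0x[0-9a-fA-F]+|_[0O]+|_[01]+)$")


class FunctionSketch(ast.NodeVisitor):
    """Hypothetical visitor: stores the name of every function definition."""

    def __init__(self):
        self.names = []

    def visit_FunctionDef(self, node):
        self.names.append(node.name)
        # Descend so nested function definitions are seen too.
        self.generic_visit(node)


source = (
    "def _(): pass\n"
    "def __(): pass\n"
    "def _0xb1(): pass\n"
    "def _00OO00OO(): pass\n"
    "def _01001001(): pass\n"
    "def main(): pass\n"
)
visitor = FunctionSketch()
visitor.visit(ast.parse(source))
flagged = [name for name in visitor.names if OBFUSCATED.match(name)]
print(flagged)
```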
- visit_Import(node: Import)[source]
Extract attributes of interest from ast.Import nodes.
Generally, the analyser will use these nodes in search of module imports which may indicate suspicious activity.
For example:
import requests
import winreg
import ctypes as ct
import win32api as _win32api
import win32con as _win32con
- Parameters:
node (ast.Import) – A node of type ast.Import.
- visit_ImportFrom(node: ImportFrom)[source]
Extract attributes of interest from ast.ImportFrom nodes.
Generally, the analyser will use these nodes in search of module imports which may indicate suspicious activity.
For example:
from win32api import SetFileAttributes
from win32con import SRCAND, FILE_ATTRIBUTE_HIDDEN
from win32file import CreateFileW, WriteFile, CloseHandle
- Parameters:
node (ast.ImportFrom) – A node of type ast.ImportFrom.
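Both import forms can be handled in much the same way. The following sketch, with a hypothetical ImportSketch class and tuple layout, records the module, the imported name (for the from form) and any alias.

```python
import ast


class ImportSketch(ast.NodeVisitor):
    """Hypothetical visitor: stores (module, imported name, alias) tuples."""

    def __init__(self):
        self.imports = []

    def visit_Import(self, node):
        for alias in node.names:
            # 'import ctypes as ct' gives name='ctypes', asname='ct'.
            self.imports.append((alias.name, None, alias.asname))

    def visit_ImportFrom(self, node):
        for alias in node.names:
            # node.module is None for relative imports such as 'from . import x'.
            self.imports.append((node.module, alias.name, alias.asname))


source = (
    "import ctypes as ct\n"
    "import winreg\n"
    "from win32con import SRCAND, FILE_ATTRIBUTE_HIDDEN\n"
)
visitor = ImportSketch()
visitor.visit(ast.parse(source))
for entry in visitor.imports:
    print(entry)
```

The source is parsed rather than imported, so the Windows-only modules need not be installed for the example to run.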
- _extract_docstrings(node: Module)[source]
Collect and store all docstrings in the module.
- Parameters:
node (ast.Module) – Top-level AST node to be searched.
The extracted (uncleaned) docstrings are stored in the _docs attribute. A constant is only tested if its value is not in the _docs attribute.
- Rationale:
Extracting and storing docstrings lets us put simple strings such as ';' and '()' in config.toml under the [analyser.constant.dangerous] and [analyser.constant.suspect] tables without triggering a false positive when the string appears somewhere in a docstring.
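A minimal sketch of collecting uncleaned docstrings with the standard library follows. The collect_docstrings function and DOC_NODES tuple are hypothetical names; the actual method may differ in which nodes it inspects.

```python
import ast

# Node types that ast.get_docstring() accepts.
DOC_NODES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def collect_docstrings(tree):
    """Return the raw docstring of every module, class and function."""
    docs = []
    for node in ast.walk(tree):
        if isinstance(node, DOC_NODES):
            # clean=False keeps the text exactly as stored in the Constant node,
            # so it can be compared directly against constant values later.
            doc = ast.get_docstring(node, clean=False)
            if doc is not None:
                docs.append(doc)
    return docs


source = (
    '"""Run (); then stop."""\n'
    "\n"
    "class Job:\n"
    '    """A job."""\n'
    "\n"
    "    def go(self):\n"
    '        """Execute."""\n'
    '        return ";"\n'
)
docs = collect_docstrings(ast.parse(source))
print(docs)
```

Here the returned ';' string is not a docstring, so it would still be tested, while the ';' and '()' inside the module docstring would not trigger a match.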
- generic_visit(node)
Called if no explicit visitor function exists for a node.
- visit(node)
Visit a node.