`core.utils`¶

Module Contents¶

Classes¶

`Bunch`	A simple but handy "collector of a bunch of named stuff" class.
`PostThread`	POSTs the given data with the headers to the URL.

Functions¶

`local_lock`(→ Iterator[None])	Locks the given namespace/key combination on the current system,
`normalize_for_url`(→ str)	Takes the given text and makes it fit to be used for an url.
`increment_name`(→ str)	Takes the given name and adds a numbered suffix beginning at 1.
`remove_repeated_spaces`(→ str)	Removes repeated spaces in the text ('a  b' -> 'a b').
`profile`(→ Iterator[None])	Profiles the wrapped code and stores the result in the profiles folder
`timing`(→ Iterator[None])	Runs the wrapped code and prints the time in ms it took to run it.
`module_path_root`(→ str)
`module_path`(→ str)	Returns a subdirectory in the given python module.
`touch`(→ None)	Touches the file on the given path.
`render_file`(→ webob.Response)	Takes the given file_path (content) and renders it to the browser.
`hash_dictionary`(→ str)	Computes a sha256 hash for the given dictionary. The dictionary
`groupbylist`(…)	Works just like Python's `itertools.groupby` function, but instead
`linkify_phone`(→ str)	Takes a string and replaces valid phone numbers with html links. If a
`top_level_domains`(→ set[str])
`linkify`(→ str)	Takes plain text and injects html links for urls and email addresses.
`paragraphify`(→ str)	Takes a text with newlines and groups them into paragraphs according to the
`to_html_ul`(→ str)	Linkify and convert text to one or multiple ul's or paragraphs.
`ensure_scheme`(→ str)	Makes sure that the given url has a scheme in front, if none
`is_uuid`(→ bool)	Returns true if the given value is a uuid. The value may be a string
`is_non_string_iterable`(→ bool)	Returns true if the given obj is an iterable, but not a string.
`relative_url`(→ str)	Removes everything in front of the path, including scheme, host,
`is_subpath`(→ bool)	Returns true if the given path is inside the given directory.
`is_sorted`(…)	Returns True if the iterable is sorted.
`morepath_modules`(→ Iterator[str])	Returns all morepath modules which should be scanned for the given
`scan_morepath_modules`(→ None)	Tries to scan all the morepath modules required for the given
`get_unique_hstore_keys`(→ set[str])	Returns a set of keys found in an hstore column over all records
`makeopendir`(→ SubFS[FS])	Creates and opens the given directory in the given PyFilesystem.
`append_query_param`(→ str)	Appends a single query parameter to an url. This is faster than
`toggle`(→ set[_T])	Returns a new set where the item has been toggled.
`binary_to_dictionary`(→ core.types.FileDict)	Takes raw binary filedata and stores it in a dictionary together
`dictionary_to_binary`(→ bytes)	Takes a dictionary created by `binary_to_dictionary()` and returns
`safe_format`(…)	Takes a user-supplied string with format blocks and returns a string
`safe_format_keys`(→ list[str])	Takes a `safe_format()` string and returns the found keys.
`is_valid_yubikey`(→ bool)	Asks the yubico validation servers if the given yubikey OTP is valid.
`is_valid_yubikey_format`(→ bool)	Returns True if the given OTP has the correct format. Does not actually
`yubikey_otp_to_serial`(→ int \| None)	Takes a Yubikey OTP and calculates the serial number of the key.
`yubikey_public_id`(→ str)	Returns the yubikey identity given a token.
`dict_path`(→ _T)	Gets the value of the given dictionary at the given path. For example:
`safe_move`(→ None)	Rename a file from `src` to `dst`.
`batched`(…)	Splits an iterable into batches of batch_size and puts them

Attributes¶

`_T`
`_KT`
`_unwanted_url_chars`
`_double_dash`
`_number_suffix`
`_repeated_spaces`
`_uuid`
`_email_regex`
`_multiple_newlines`
`_phone_inside_a_tags`
`_phone_ch_country_code`
`_phone_ch`
`_phone_ch_html_safe`
`ALPHABET`
`ALPHABET_RE`

core.utils._T[source]¶

core.utils._KT[source]¶

core.utils._unwanted_url_chars[source]¶

core.utils._double_dash[source]¶

core.utils._number_suffix[source]¶

core.utils._repeated_spaces[source]¶

core.utils._uuid[source]¶

core.utils._email_regex[source]¶

core.utils._multiple_newlines[source]¶

core.utils._phone_inside_a_tags = '(\\">|href=\\"tel:)?'[source]¶

core.utils._phone_ch_country_code = '(\\+41|0041|0[0-9]{2})'[source]¶

core.utils._phone_ch[source]¶

core.utils._phone_ch_html_safe[source]¶

core.utils.ALPHABET = 'cbdefghijklnrtuv'[source]¶

core.utils.ALPHABET_RE[source]¶

core.utils.local_lock(namespace: str, key: str) → Iterator[None][source]¶

Locks the given namespace/key combination on the current system, automatically freeing it after the with statement has been completed or once the process is killed.

Usage:

with local_lock('namespace', 'key'):
    pass

core.utils.normalize_for_url(text: str) → str[source]¶

Takes the given text and makes it fit to be used for an url.

That means replacing spaces and other unwanted characters with ‘-’, lowercasing everything and turning unicode characters into their closest ascii equivalent using Unidecode.

See https://pypi.python.org/pypi/Unidecode
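The steps described above can be sketched roughly as follows. This is a minimal illustration, not the module's implementation: it assumes the standard library's `unicodedata` as a stand-in for Unidecode (which covers far more characters), and the name `normalize_for_url_sketch` and the regular expression are hypothetical.

```python
import re
import unicodedata

# Assumption: anything that is not a lowercase letter or digit is "unwanted".
_unwanted = re.compile(r'[^a-z0-9]+')


def normalize_for_url_sketch(text: str) -> str:
    # Decompose accented characters and drop the non-ascii parts.
    decomposed = unicodedata.normalize('NFKD', text)
    ascii_text = decomposed.encode('ascii', 'ignore').decode('ascii')
    # Lowercase, replace runs of unwanted characters with a single dash,
    # and trim dashes at both ends.
    return _unwanted.sub('-', ascii_text.lower()).strip('-')
```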

core.utils.increment_name(name: str) → str[source]¶

Takes the given name and adds a numbered suffix beginning at 1.

For example:

foo => foo-1
foo-1 => foo-2
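A possible way to get this behaviour is shown below. It is only a sketch: the function name and the suffix pattern are assumptions made for illustration and may differ from the module's `_number_suffix`.

```python
import re

# Assumption: the suffix is a dash followed by digits at the end of the name.
_suffix = re.compile(r'-([0-9]+)$')


def increment_name_sketch(name: str) -> str:
    match = _suffix.search(name)
    if match is None:
        # No numbered suffix yet, start at 1.
        return f'{name}-1'
    number = int(match.group(1)) + 1
    return f'{name[:match.start()]}-{number}'
```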

core.utils.remove_repeated_spaces(text: str) → str[source]¶: Removes repeated spaces in the text (‘a  b’ -> ‘a b’).

core.utils.profile(filename: str) → Iterator[None][source]¶: Profiles the wrapped code and stores the result in the profiles folder with the given filename.

core.utils.timing(name: str | None = None) → Iterator[None][source]¶: Runs the wrapped code and prints the time in ms it took to run it. The name is printed in front of the time, if given.
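Since `timing` returns `Iterator[None]`, it is presumably a generator-based context manager. A minimal sketch of that pattern follows; the name `timing_sketch` and the exact output format are assumptions.

```python
import time
from contextlib import contextmanager


@contextmanager
def timing_sketch(name=None):
    start = time.perf_counter()
    try:
        yield
    finally:
        # Elapsed wall-clock time in milliseconds.
        duration_ms = (time.perf_counter() - start) * 1000
        prefix = f'{name}: ' if name else ''
        print(f'{prefix}{duration_ms:.0f} ms')
```

It would be used as `with timing_sketch('query'): ...`, printing one line when the block exits.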

core.utils.module_path_root(module: ModuleType | str) → str[source]¶

core.utils.module_path(module: ModuleType | str, subpath: str) → str[source]¶

Returns a subdirectory in the given python module.

Parameters:

module – A python module (actual module or string).
subpath – Subpath below that python module. Leading slashes (‘/’) are ignored.

core.utils.touch(file_path: str) → None[source]¶: Touches the file on the given path.

class core.utils.Bunch(**kwargs: Any)[source]¶

A simple but handy “collector of a bunch of named stuff” class.

See https://code.activestate.com/recipes/52308-the-simple-but-handy-collector-of-a-bunch-of-named/.

For example:

point = Bunch(x=1, y=2)
assert point.x == 1
assert point.y == 2

point.z = 3
assert point.z == 3

Allows the creation of simple nested bunches, for example:

request = Bunch(**{'app.settings.org.my_setting': True})
assert request.app.settings.org.my_setting is True

__eq__(other: object) → bool[source]¶: Return self==value.

__ne__(other: object) → bool[source]¶: Return self!=value.

core.utils.render_file(file_path: str, request: core.request.CoreRequest) → webob.Response[source]¶: Takes the given file_path (content) and renders it to the browser. The file must exist on the local system and be readable by the current process.

core.utils.hash_dictionary(dictionary: dict[str, Any]) → str[source]¶

Computes a sha256 hash for the given dictionary. The dictionary is expected to only contain values that can be serialized by json.

That includes int, decimal, string, boolean.

Note that this function is not meant to be used for hashing secrets. Do not include data in this dictionary that is secret!
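One common way to hash a dictionary deterministically is to serialize it with sorted keys first. The sketch below assumes that approach; the actual serialization used by `hash_dictionary` is not stated here, and `hash_dictionary_sketch` is a hypothetical name.

```python
import hashlib
import json


def hash_dictionary_sketch(dictionary):
    # Sorting the keys makes the result independent of insertion order.
    serialized = json.dumps(dictionary, sort_keys=True).encode('utf-8')
    return hashlib.sha256(serialized).hexdigest()
```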

core.utils.groupbylist(iterable: collections.abc.Iterable[_T], key: None = ...) → list[tuple[_T, list[_T]]][source]¶
core.utils.groupbylist(iterable: collections.abc.Iterable[_T], key: Callable[[_T], _KT]) → list[tuple[_KT, list[_T]]]: Works just like Python’s itertools.groupby function, but instead of returning generators, it returns lists.
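The difference to `itertools.groupby` can be illustrated with a small sketch (the name `groupbylist_sketch` is hypothetical). As with `groupby`, only consecutive items with the same key are grouped.

```python
from itertools import groupby


def groupbylist_sketch(iterable, key=None):
    # Materialize every group so the result can be iterated more than once.
    return [(k, list(group)) for k, group in groupby(iterable, key=key)]
```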

core.utils.linkify_phone(text: str) → str[source]¶: Takes a string and replaces valid phone numbers with html links. If a phone number is matched, it will be replaced by the result of a callback function, that does further checks on the regex match. If these checks do not pass, the matched number will remain unchanged.

core.utils.top_level_domains() → set[str][source]¶

core.utils.linkify(text: str | None, escape: bool = True) → str[source]¶

Takes plain text and injects html links for urls and email addresses.

By default the text is html escaped before it is linkified. This accounts for the fact that we usually use this for text blocks that we mean to extend with email addresses and urls.

If html is already possible, why linkify it?

Note: We need to clean the html after we’ve created it (linkify parses escaped html and turns it into real html). As a consequence it is possible to have html urls in the text that won’t be escaped.

core.utils.paragraphify(text: str) → str[source]¶

Takes a text with newlines and groups them into paragraphs according to the following rules:

If there’s a single newline between two lines, a <br> will replace that newline.

If there are multiple newlines between two lines, each line will become a paragraph and the extra newlines are discarded.
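The two rules can be sketched as follows. This is an illustration only: the name and the regular expression are assumptions, and unlike a production implementation it does not escape html.

```python
import re

# Assumption: two or more consecutive newlines separate paragraphs.
_paragraph_break = re.compile(r'\n{2,}')


def paragraphify_sketch(text: str) -> str:
    text = text.strip()
    if not text:
        return ''
    paragraphs = _paragraph_break.split(text)
    # Single newlines inside a paragraph become <br> tags.
    return ''.join(
        '<p>{}</p>'.format(p.replace('\n', '<br>')) for p in paragraphs
    )
```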

core.utils.to_html_ul(value: str | None, convert_dashes: bool = True, with_title: bool = False) → str[source]¶: Linkify and convert text to one or multiple ul’s or paragraphs.

core.utils.ensure_scheme(url: str, default: str = 'http') → str[source]¶: Makes sure that the given url has a scheme in front, if none was provided.

core.utils.is_uuid(value: str | uuid.UUID) → bool[source]¶: Returns true if the given value is a uuid. The value may be a string or of type UUID. If it’s a string, the uuid is checked with a regex.
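A check along these lines could look like the sketch below. The pattern is an assumption and may be stricter or looser than the module's `_uuid` regex.

```python
import re
from uuid import UUID

# Assumption: hex groups of 8-4-4-4-12 with optional dashes, any case.
_uuid_pattern = re.compile(
    r'^[a-f0-9]{8}-?[a-f0-9]{4}-?[a-f0-9]{4}-?[a-f0-9]{4}-?[a-f0-9]{12}$',
    re.IGNORECASE,
)


def is_uuid_sketch(value) -> bool:
    if isinstance(value, UUID):
        return True
    return bool(_uuid_pattern.match(str(value)))
```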

core.utils.is_non_string_iterable(obj: object) → bool[source]¶: Returns true if the given obj is an iterable, but not a string.

core.utils.relative_url(absolute_url: str | None) → str[source]¶: Removes everything in front of the path, including scheme, host, username, password and port.

core.utils.is_subpath(directory: str, path: str) → bool[source]¶: Returns true if the given path is inside the given directory.
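A typical way to implement such a check is to resolve both paths and compare prefixes, which also rejects `..` traversal. The sketch below assumes that approach and treats the directory itself as not being inside itself; the real function may decide that case differently.

```python
import os


def is_subpath_sketch(directory: str, path: str) -> bool:
    # The trailing separator keeps '/srv/a' from matching '/srv/ab'.
    directory = os.path.join(os.path.realpath(directory), '')
    path = os.path.realpath(path)
    return os.path.commonprefix([path, directory]) == directory
```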

core.utils.is_sorted(iterable: Iterable[SupportsRichComparison], key: Callable[[SupportsRichComparison], SupportsRichComparison] = ..., reverse: bool = ...) → bool[source]¶
core.utils.is_sorted(iterable: Iterable[_T], key: Callable[[_T], SupportsRichComparison], reverse: bool = ...) → bool: Returns True if the iterable is sorted.
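As an illustration, a sortedness check can compare neighbouring items pairwise. The sketch below is an assumption about the approach, not the module's code, and it treats equal neighbours as sorted.

```python
def is_sorted_sketch(iterable, key=lambda item: item, reverse=False):
    keys = [key(item) for item in iterable]
    pairs = zip(keys, keys[1:])
    if reverse:
        return all(a >= b for a, b in pairs)
    return all(a <= b for a, b in pairs)
```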

core.utils.morepath_modules(cls: type[morepath.App]) → Iterator[str][source]¶

Returns all morepath modules which should be scanned for the given morepath application class.

We can’t reliably know the actual morepath modules that need to be scanned, which is why we assume that each module has one namespace (like ‘more.transaction’ or ‘onegov.core’).

core.utils.scan_morepath_modules(cls: type[morepath.App]) → None[source]¶: Tries to scan all the morepath modules required for the given application class. This is not guaranteed to stay reliable as there is no sure way to discover all modules required by the application class.

core.utils.get_unique_hstore_keys(session: sqlalchemy.orm.Session, column: Column[dict[str, Any]]) → set[str][source]¶: Returns a set of keys found in an hstore column over all records of its table.

core.utils.makeopendir(fs: fs.base.FS, directory: str) → SubFS[FS][source]¶: Creates and opens the given directory in the given PyFilesystem.

core.utils.append_query_param(url: str, key: str, value: str) → str[source]¶

Appends a single query parameter to an url. This is faster than using Purl, if and only if we only add one query param.

Also this function assumes that the value is already url encoded.
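The speed advantage comes from avoiding a full URL parse. A sketch of the idea, with a hypothetical name and without handling URL fragments:

```python
def append_query_param_sketch(url: str, key: str, value: str) -> str:
    # Assumes key and value are already url encoded.
    separator = '&' if '?' in url else '?'
    return f'{url}{separator}{key}={value}'
```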

class core.utils.PostThread(url: str, data: bytes, headers: Collection[tuple[str, str]], timeout: float = 30)[source]¶

Bases: threading.Thread

POSTs the given data with the headers to the URL.

Example:

import json

data = {'a': 1, 'b': 2}
data = json.dumps(data).encode('utf-8')
PostThread(
    'https://example.com/post',
    data,
    (
        ('Content-Type', 'application/json; charset=utf-8'),
        # header values are strings, so the length is converted
        ('Content-Length', str(len(data)))
    )
).start()

This only works for external URLs! If posting to the server itself is needed, use a process instead of the thread!

run() → None[source]¶

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.

core.utils.toggle(collection: set[_T], item: _T | None) → set[_T][source]¶: Returns a new set where the item has been toggled.
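Toggling means removing the item if it is present and adding it otherwise. In the sketch below, passing `None` returns an unchanged copy; that is an assumption derived from the `_T | None` signature.

```python
def toggle_sketch(collection: set, item) -> set:
    if item is None:
        # Assumption: nothing to toggle, return an unchanged copy.
        return set(collection)
    if item in collection:
        return collection - {item}
    return collection | {item}
```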

core.utils.binary_to_dictionary(binary: bytes, filename: str | None = None) → core.types.FileDict[source]¶

Takes raw binary filedata and stores it in a dictionary together with metadata information.

The data is compressed before it is stored in the dictionary. Use dictionary_to_binary() to get the original binary data back.

core.utils.dictionary_to_binary(dictionary: core.types.LaxFileDict) → bytes[source]¶: Takes a dictionary created by binary_to_dictionary() and returns the original binary data.
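The round trip between the two functions can be pictured with the sketch below. The dictionary keys, the gzip and base64 encoding, and the use of `mimetypes` are assumptions for illustration; the actual layout is defined by `core.types.FileDict`.

```python
import base64
import gzip
import mimetypes


def binary_to_dictionary_sketch(binary: bytes, filename=None) -> dict:
    guessed = mimetypes.guess_type(filename)[0] if filename else None
    return {
        # Compressed and base64 encoded so the value is json friendly.
        'data': base64.b64encode(gzip.compress(binary)).decode('ascii'),
        'filename': filename,
        'mimetype': guessed or 'application/octet-stream',
        'size': len(binary),
    }


def dictionary_to_binary_sketch(dictionary: dict) -> bytes:
    return gzip.decompress(base64.b64decode(dictionary['data']))
```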

core.utils.safe_format(format: str, dictionary: dict[str, str | int | float], types: None = ..., adapt: Callable[[str], str] | None = ..., raise_on_missing: bool = ...) → str[source]¶

core.utils.safe_format(format: str, dictionary: dict[str, _T], types: set[type[_T]] = ..., adapt: Callable[[str], str] | None = ..., raise_on_missing: bool = ...) → str

Takes a user-supplied string with format blocks and returns a string where those blocks are replaced by values in a dictionary.

For example:

>>> safe_format('[user] has logged in', {'user': 'admin'})
'admin has logged in'

Parameters:

format – The format to use. Square brackets denote dictionary keys. To literally print square brackets, mask them by doubling (‘[[’ -> ‘[’)
dictionary – The dictionary holding the variables to use. If the key is not found in the dictionary, the bracket is replaced with an empty string.
types –
A set of types supported by the dictionary. Limiting this to safe types like builtins (str, int, float) ensures that no values are accidentally leaked through faulty __str__ representations.

Note that inheritance is ignored. Supported types need to be whitelisted explicitly.
adapt – An optional callable that receives the key before it is used. Returns the same key or an altered version.
raise_on_missing – True if missing keys should result in a runtime error (defaults to False).

This is strictly meant for formats provided by users. Python’s string formatting options are clearly superior to this; however, they are less secure!
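The behaviour described above can be approximated with a small regex-based sketch. It is not the module's implementation: it omits `adapt`, does not detect unbalanced brackets, and raising `RuntimeError` for an unsupported type is an assumption.

```python
import re

# Matches either a doubled opening bracket or a [key] block.
_block = re.compile(r'\[\[|\[([^\[\]]*)\]')


def safe_format_sketch(format, dictionary, types=(str, int, float),
                       raise_on_missing=False):
    def replace(match):
        if match.group(0) == '[[':
            return '['
        key = match.group(1)
        if key not in dictionary:
            if raise_on_missing:
                raise RuntimeError(f'Key not found: {key}')
            return ''
        value = dictionary[key]
        # Exact type comparison, so subclasses are not accepted.
        if type(value) not in types:
            raise RuntimeError(f'Unsupported type: {type(value).__name__}')
        return str(value)

    return _block.sub(replace, format)
```

Because the type check uses `type(value)` rather than `isinstance`, a `bool` is rejected even though it subclasses `int`, which mirrors the note that inheritance is ignored.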

core.utils.safe_format_keys(format: str, adapt: Callable[[str], str] | None = None) → list[str][source]¶: Takes a safe_format() string and returns the found keys.

core.utils.is_valid_yubikey(client_id: str, secret_key: str, expected_yubikey_id: str, yubikey: str) → bool[source]¶

Asks the yubico validation servers if the given yubikey OTP is valid.

Parameters:

client_id – The yubico API client id.
secret_key – The yubico API secret key.
expected_yubikey_id – The expected yubikey id. The yubikey id is defined as the first twelve characters of any yubikey value. Each user should have a yubikey associated with their account. If the yubikey value comes from a different key, the key is invalid.
yubikey – The actual yubikey value that should be verified.

Returns:

True if yubico confirmed the validity of the key.

core.utils.is_valid_yubikey_format(otp: str) → bool[source]¶: Returns True if the given OTP has the correct format. Does not actually contact Yubico, so this function may return True for some invalid keys.

core.utils.yubikey_otp_to_serial(otp: str) → int | None[source]¶

Takes a Yubikey OTP and calculates the serial number of the key.

The serial number is printed on the yubikey, in decimal and as a QR code.

Example:

>>> yubikey_otp_to_serial(
...     'ccccccdefghdefghdefghdefghdefghdefghdefghklv')
2311522

Adapted from Java:

https://github.com/Yubico/yubikey-salesforce-client/blob/e38e46ee90296a852374a8b744555e99d16b6ca7/src/classes/Modhex.cls

If the key cannot be calculated, None is returned. This can happen if the key is malformed.
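The calculation can be sketched with the modhex alphabet from `ALPHABET`: each of the first twelve characters maps to one hex digit by its position in the alphabet, and the resulting hex string is read as an integer. The format check is a simplifying assumption (44 modhex characters), and the function name is hypothetical.

```python
ALPHABET = 'cbdefghijklnrtuv'


def yubikey_otp_to_serial_sketch(otp: str):
    # Assumption: a well-formed OTP has 44 modhex characters.
    if len(otp) != 44 or any(char not in ALPHABET for char in otp):
        return None
    public_id = otp[:12]
    # The position in the modhex alphabet is the hex digit value.
    hex_digits = ''.join(f'{ALPHABET.index(char):x}' for char in public_id)
    return int(hex_digits, 16)
```

For the OTP from the example above, the public id `ccccccdefghd` maps to the hex string `000000234562`, which is 2311522 in decimal.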

core.utils.yubikey_public_id(otp: str) → str[source]¶: Returns the yubikey identity given a token.

core.utils.dict_path(dictionary: dict[str, _T], path: str) → _T[source]¶

Gets the value of the given dictionary at the given path. For example:

>>> data = {'foo': {'bar': True}}
>>> dict_path(data, 'foo.bar')
True
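A dotted path lookup like this can be sketched as a simple loop over the path segments. In this sketch a missing segment raises `KeyError`; how the module handles that case is not stated here.

```python
def dict_path_sketch(dictionary: dict, path: str):
    value = dictionary
    # Descend one level per dot-separated segment.
    for key in path.split('.'):
        value = value[key]
    return value
```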

core.utils.safe_move(src: str, dst: str, tmp_dst: str | None = None) → None[source]¶

Rename a file from src to dst.

Optionally provide a tmp_dst where the file will be copied to before being renamed. This needs to be on the same filesystem as dst, otherwise this will fail.

Moves must be atomic. shutil.move() is not atomic.
Moves must work across filesystems. Often temp directories and the cache directories live on different filesystems. os.rename() can throw errors if run across filesystems.

So we try os.rename(), but if we detect a cross-filesystem copy, we switch to shutil.move() with some wrappers to make it atomic.

Via https://alexwlchan.net/2019/03/atomic-cross-filesystem-moves-in-python
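The strategy described above can be sketched as follows. This is a simplified illustration under the assumption that the temporary copy is placed next to `dst` (or at `tmp_dst`) with a random suffix; the function name is hypothetical.

```python
import errno
import os
import shutil
import uuid


def safe_move_sketch(src: str, dst: str, tmp_dst=None) -> None:
    try:
        # Atomic as long as src and dst are on the same filesystem.
        os.rename(src, dst)
    except OSError as err:
        if err.errno != errno.EXDEV:
            raise
        # Cross-filesystem move: copy to a unique temporary name first.
        # This copy is not atomic, but nobody sees it under the final name.
        tmp_path = f'{tmp_dst or dst}.{uuid.uuid4()}.tmp'
        shutil.copyfile(src, tmp_path)
        # The rename onto the final name is atomic again.
        os.rename(tmp_path, dst)
        os.unlink(src)
```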

core.utils.batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: type[tuple] = ...) → Iterator[tuple[_T, ...]][source]¶

core.utils.batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: type[list]) → Iterator[list[_T]]

core.utils.batched(iterable: collections.abc.Iterable[_T], batch_size: int, container_factory: Callable[[Iterator[_T]], Collection[_T]]) → Iterator[Collection[_T]]

Splits an iterable into batches of batch_size and puts them inside a given collection (tuple by default).

The container_factory is necessary in order to consume the iterator returned by islice. Otherwise this function would never return.
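The role of the container factory becomes clearer in a sketch: `islice` always returns an iterator, even when the source is exhausted, so the batch has to be materialized to find out whether it is empty. The name `batched_sketch` is hypothetical.

```python
from itertools import islice


def batched_sketch(iterable, batch_size, container_factory=tuple):
    iterator = iter(iterable)
    while True:
        # Materialize the slice so that an empty batch can be detected.
        batch = container_factory(islice(iterator, batch_size))
        if len(batch) == 0:
            return
        yield batch
```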
