pyrocko.util¶
Utility functions for Pyrocko.
High precision time handling mode¶
Pyrocko can treat timestamps either as standard double precision (64 bit) floating point values, or as high precision floats (numpy.float128 or numpy.float96, whichever is available, see NumPy Scalars), aliased here as hpfloat. High precision time stamps are required when handling data with sub-millisecond precision, i.e. kHz/MHz data streams and event catalogs derived from such data.
Not all functions in Pyrocko and in programs depending on Pyrocko may work correctly with high precision times. Therefore, Pyrocko’s high precision time handling mode has to be explicitly enabled through the user config, through a command line option, or by enforcing it within a script/program.
The default high precision time handling mode can be configured globally with the user configuration variable use_high_precision_time. Calling the function use_high_precision_time() overrides the default from the config file. This function may be called at startup of a program/script requiring a specific time handling mode.
To create a valid time stamp for use in Pyrocko (e.g. in Event or Trace objects), use:
import time
from pyrocko import util
# By default using mode selected in user config, override with:
# util.use_high_precision_time(True) # force high precision mode
# util.use_high_precision_time(False) # force low precision mode
t1 = util.str_to_time('2020-08-27 10:22:00')
t2 = util.str_to_time('2020-08-27 10:22:00.111222')
t3 = util.to_time_float(time.time())
# To get the appropriate float class, use:
time_float = util.get_time_float()
# -> float, numpy.float128 or numpy.float96
[isinstance(t, time_float) for t in [t1, t2, t3]]
# -> [True, True, True]
# Shortcut:
util.check_time_class(t1)
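To get a feeling for the limits of double precision, the spacing between adjacent 64-bit floats near a present-day timestamp can be inspected with the standard library alone. This is only an illustrative sketch and does not use Pyrocko; it assumes Python 3.9 or later for math.ulp.

```python
import math

# Epoch timestamp of 2020-08-27 10:22:00 UTC, in seconds.
t = 1598523720.0

# Gap between t and the next larger 64-bit float, in seconds.
resolution = math.ulp(t)
print(resolution)

# An increment far below that gap is rounded away in 64-bit arithmetic.
lost = (t + 1e-8) == t
print(lost)
```

Time differences smaller than the printed gap cannot be represented at this epoch with a 64-bit float.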
Module content¶
- class hpfloat¶
Alias for NumPy’s high precision float data type float128 or float96, if available.
On platforms lacking support for high precision floats, an attempt to create a hpfloat instance raises HPFloatUnavailable.
Functions
|
Return evenly spaced numbers over a specified interval. |
|
Decode base36 encoded positive integer. |
|
Convert positive integer to a base36 string. |
|
Type-check variable against current time handling mode. |
|
Check for inconsistencies. |
|
Convert string representing UTC time to system time. |
|
Get beginning of day for any point in time. |
|
Downsample the signal x by an integer factor q, using an order n filter |
|
Get integer decimation sequence for given downsampling factor. |
|
Approximate 1st or 2nd derivative of an array. |
|
Approximate first derivative of an array (second order, central FD). |
|
Approximate first derivative of an array (fourth order, central FD). |
|
Approximate second derivative of an array (second order, central FD). |
|
Approximate second derivative of an array (fourth order, central FD). |
|
Create directory and all intermediate path components to it as needed. |
|
Create all intermediate path components for a target path. |
|
Backslash-escape double-quotes and backslashes. |
|
Backslash-escape single-quotes and backslashes. |
|
Greatest common divisor. |
Try to import threadpoolctl.threadpool_limits, provide dummy if not avail. |
|
Get effective NumPy float class to handle timestamps. |
|
Get the effective float class for timestamps. |
|
Caching variant of |
|
|
Pretty print floating point numbers. |
|
Get string representation from system time, UTC. |
|
Get string representation from system time, UTC. |
|
Get string representation from system time, UTC. |
|
Time offset t_gps - t_utc for a given t_utc. |
|
Get beginning of hour for any point in time. |
|
Yields begin and end of days until given time span is covered. |
|
Yields begin and end of months until given time span is covered. |
|
Recursively select files (generator variant). |
|
Yields begin and end of years until given time span is covered. |
|
Get the day number after the 1st of January of year in |
|
Least common multiple. |
|
Match network-station-location-channel code against pattern or list of patterns. |
|
Get network-station-location-channel codes that match given pattern or any of several given patterns. |
|
Make table with decimation sequences. |
|
Get beginning of month for any point in time. |
|
Calculate definite integral of piece-wise linear function on intervals. |
|
Fit piece-wise linear function to data. |
|
Notify user that an operation has started. |
|
Notify user that an operation has ended. |
|
Join sequence of strings into a line, double-quoting non-trivial strings. |
|
Join sequence of strings into a line, single-quoting non-trivial strings. |
|
Split line into list of strings, allowing for quoted strings. |
|
Extract leap second information from tzdata. |
|
Get unique instance of an object. |
|
Recursively select files. |
|
Initialize logging. |
|
Convert string representing UTC time to floating point system time. |
Default |
|
|
Convert string representing UTC time to floating point system time. |
Get arguments from previous call to setup_logging. |
|
|
Get string representation for floating point system time. |
Convert float to valid time stamp in the current time handling mode. |
|
|
Get string representation for floating point system time. |
|
Unescape backslash-escaped double-quotes and backslashes. |
|
Unescape backslash-escaped single-quotes and backslashes. |
|
Unpack fixed format string, as produced by many fortran codes. |
|
Globally force a specific time handling mode. |
|
Time offset t_utc - t_gps for a given t_gps. |
|
Check time range supported by the system's time conversion functions. |
|
Paragraph and list-aware wrapping of text. |
|
Get beginning of year for any point in time. |
Classes
|
Dict-to-object utility. |
|
Use POSIX advisory file locking to ensure that only a single instance of a program is running. |
Simple stopwatch to measure elapsed wall clock time. |
|
|
Read table of space separated values from a file. |
|
Write table of space separated values to a file. |
Exceptions
Raised by |
|
Raised when a download failed. |
|
Exception raised by |
|
Exception raised by |
|
Raised when a high precision float type would be required but is not available. |
|
Raised when the download target file already exists. |
|
Exception raised by objects of type |
|
Raised for invalid time strings. |
|
Exception raised by |
|
Exception raised when |
- setup_logging(programname='pyrocko', levelname='warning')[source]¶
Initialize logging.
- Parameters:
programname – program name to be written in log
levelname – string indicating the logging level (‘debug’, ‘info’, ‘warning’, ‘error’, ‘critical’)
This function is called at startup by most pyrocko programs to set up a consistent logging format. This is simply a shortcut to a call to
logging.basicConfig()
.
- subprocess_setup_logging_args()[source]¶
Get arguments from previous call to setup_logging.
These can be sent down to a worker process so it can set up its logging in the same way as the main process.
- exception PathExists[source]¶
Bases:
DownloadError
Raised when the download target file already exists.
- exception HPFloatUnavailable[source]¶
Bases:
Exception
Raised when a high precision float type would be required but is not available.
- use_high_precision_time(enabled)[source]¶
Globally force a specific time handling mode.
See High precision time handling mode.
- Parameters:
enabled (bool) – enable/disable use of high precision time type
This function should be called before handling/reading any time data. It can only be called once.
Special attention is required when using multiprocessing on a platform which does not use fork under the hood. In such cases, the desired setting must be set also in the subprocess.
- class Stopwatch[source]¶
Bases:
object
Simple stopwatch to measure elapsed wall clock time.
Usage:
import time
from pyrocko.util import Stopwatch
s = Stopwatch()
time.sleep(1)
print(s())
time.sleep(1)
print(s())
- progress_beg(label)[source]¶
Notify user that an operation has started.
- Parameters:
label – name of the operation
To be used in conjunction with progress_end().
- progress_end(label='')[source]¶
Notify user that an operation has ended.
- Parameters:
label – name of the operation
To be used in conjunction with progress_beg().
- exception ArangeError[source]¶
Bases:
ValueError
Raised by
arange2()
for inconsistent range specifications.
- arange2(start, stop, step, dtype=<class 'float'>, epsilon=1e-06, error='raise')[source]¶
Return evenly spaced numbers over a specified interval.
Like numpy.arange() but returning floating point numbers by default and with defined behaviour when stepsize is inconsistent with interval bounds. It is considered inconsistent if the difference between the closest multiple of step and stop is larger than epsilon * step. Inconsistencies are handled according to the error parameter. If it is set to 'raise' an exception of type ArangeError is raised. If it is set to 'round', 'floor', or 'ceil', stop is silently changed to the closest, the next smaller, or next larger multiple of step, respectively.
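The consistency rule can be sketched in plain Python. The helper below is hypothetical and not Pyrocko's implementation: it assumes multiples of step are counted from start, and it raises a plain ValueError where arange2() raises ArangeError.

```python
import math


def consistent_stop(start, stop, step, epsilon=1e-6, error='raise'):
    """Hypothetical helper: return a stop value consistent with step."""
    n_exact = (stop - start) / step
    n_round = round(n_exact)
    # Distance between stop and the closest multiple of step.
    mismatch = abs(start + n_round * step - stop)
    if mismatch <= epsilon * abs(step):
        return start + n_round * step
    if error == 'raise':
        raise ValueError('step is inconsistent with interval bounds')
    # Choose how the number of steps is adjusted.
    pick = {'round': round, 'floor': math.floor, 'ceil': math.ceil}[error]
    return start + pick(n_exact) * step


print(consistent_stop(0.0, 1.1, 0.25, error='floor'))
print(consistent_stop(0.0, 1.1, 0.25, error='ceil'))
```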
- polylinefit(x, y, n_or_xnodes)[source]¶
Fit piece-wise linear function to data.
- Parameters:
x,y – arrays with coordinates of data
n_or_xnodes – int, number of segments or x coordinates of polyline
- Returns:
(xnodes, ynodes, rms_error) arrays with coordinates of polyline, root-mean-square error
- plf_integrate_piecewise(x_edges, x, y)[source]¶
Calculate definite integral of piece-wise linear function on intervals.
Use trapezoidal rule to calculate definite integral of a piece-wise linear function for a series of consecutive intervals.
x_edges and x must be sorted.
- Parameters:
x_edges – array with edges of the intervals
x,y – arrays with coordinates of piece-wise linear function’s control points
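The idea can be illustrated without NumPy. The following sketch is not Pyrocko's code; it assumes that x is strictly increasing and that all interval edges lie within the range of x. The names interp and integrate_piecewise are made up for this example.

```python
from bisect import bisect_right


def interp(x, y, xq):
    """Linearly interpolate the control points (x, y) at position xq."""
    i = min(max(bisect_right(x, xq), 1), len(x) - 1)
    x0, x1 = x[i - 1], x[i]
    return y[i - 1] + (y[i] - y[i - 1]) * (xq - x0) / (x1 - x0)


def integrate_piecewise(x_edges, x, y):
    """Integrate the polyline over each interval between consecutive edges."""
    integrals = []
    for a, b in zip(x_edges[:-1], x_edges[1:]):
        # Integration nodes: interval bounds plus control points inside.
        xs = [a] + [xi for xi in x if a < xi < b] + [b]
        ys = [interp(x, y, xi) for xi in xs]
        # The trapezoidal rule is exact between consecutive nodes.
        integrals.append(sum(
            0.5 * (ys[k] + ys[k + 1]) * (xs[k + 1] - xs[k])
            for k in range(len(xs) - 1)))
    return integrals


# Triangle with peak 1 at x = 1, integrated over [0, 0.5] and [0.5, 2].
print(integrate_piecewise([0.0, 0.5, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]))
```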
- diff_fd_1d_4o(dt, data)[source]¶
Approximate first derivative of an array (fourth order, central FD).
- Parameters:
dt – sampling interval
data – NumPy array with data samples
- Returns:
NumPy array with same shape as input
Interior points are approximated to fourth order, edge points to first order right- or left-sided respectively, points next to edge to second order central.
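The scheme described above can be written out with plain Python lists. This is a sketch using the standard central difference stencils, not Pyrocko's NumPy implementation; the function name is hypothetical and at least five samples are assumed.

```python
def diff_1d_4o_sketch(dt, data):
    """First derivative: 4th order interior, lower order near the edges."""
    n = len(data)  # assumes n >= 5
    out = [0.0] * n
    # Interior points: fourth order central stencil.
    for i in range(2, n - 2):
        out[i] = (data[i - 2] - 8.0 * data[i - 1]
                  + 8.0 * data[i + 1] - data[i + 2]) / (12.0 * dt)
    # Points next to the edges: second order central stencil.
    for i in (1, n - 2):
        out[i] = (data[i + 1] - data[i - 1]) / (2.0 * dt)
    # Edge points: first order one-sided differences.
    out[0] = (data[1] - data[0]) / dt
    out[-1] = (data[-1] - data[-2]) / dt
    return out


# Samples of t**3 at t = 0..5; the exact derivative is 3 * t**2.
print(diff_1d_4o_sketch(1.0, [0.0, 1.0, 8.0, 27.0, 64.0, 125.0]))
```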
- diff_fd_1d_2o(dt, data)[source]¶
Approximate first derivative of an array (second order, central FD).
- Parameters:
dt – sampling interval
data – NumPy array with data samples
- Returns:
NumPy array with same shape as input
Interior points are approximated to second order, edge points to first order right- or left-sided respectively.
Uses
numpy.gradient()
.
- diff_fd_2d_4o(dt, data)[source]¶
Approximate second derivative of an array (fourth order, central FD).
- Parameters:
dt – sampling interval
data – NumPy array with data samples
- Returns:
NumPy array with same shape as input
Interior points are approximated to fourth order, next-to-edge points to second order, edge points repeated.
- diff_fd_2d_2o(dt, data)[source]¶
Approximate second derivative of an array (second order, central FD).
- Parameters:
dt – sampling interval
data – NumPy array with data samples
- Returns:
NumPy array with same shape as input
Interior points are approximated to second order, edge points repeated.
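A minimal pure-Python sketch of this scheme follows. It reads "edge points repeated" as copying the neighbouring interior value to each edge, which is an assumption; the function name is hypothetical.

```python
def diff_2d_2o_sketch(dt, data):
    """Second derivative with the three-point central stencil."""
    n = len(data)  # assumes n >= 3
    out = [0.0] * n
    for i in range(1, n - 1):
        out[i] = (data[i - 1] - 2.0 * data[i] + data[i + 1]) / dt ** 2
    # Assumed edge handling: repeat the adjacent interior value.
    out[0] = out[1]
    out[-1] = out[-2]
    return out


# Samples of t**2 at t = 0..4; the exact second derivative is 2.
print(diff_2d_2o_sketch(1.0, [0.0, 1.0, 4.0, 9.0, 16.0]))
```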
- diff_fd(n, order, dt, data)[source]¶
Approximate 1st or 2nd derivative of an array.
- Parameters:
n – 1 for first derivative, 2 for second
order – order of the approximation; 2 and 4 are supported
dt – sampling interval
data – NumPy array with data samples
- Returns:
NumPy array with same shape as input
This is a frontend to the functions diff_fd_1d_2o(), diff_fd_1d_4o(), diff_fd_2d_2o(), and diff_fd_2d_4o().
Raises ValueError for unsupported n or order.
- decimate(x, q, n=None, ftype='iir', zi=None, ioff=0)[source]¶
Downsample the signal x by an integer factor q, using an order n filter.
By default, an order 8 Chebyshev type I filter is used, or a 30 point FIR filter with Hamming window if ftype is ‘fir’.
- Parameters:
x – the signal to be downsampled (1D numpy.ndarray)
q – the downsampling factor
n – order of the filter (1 less than the length of the filter for a fir filter)
ftype – type of the filter; can be iir, fir or fir-remez
- Returns:
the downsampled signal (1D
numpy.ndarray
)
- mk_decitab(nmax=100)[source]¶
Make table with decimation sequences.
Decimation from one sampling rate to a lower one is achieved by a successive application of decimate() with small integer downsampling factors (because using large downsampling factors can make the decimation unstable or slow). This function sets up a table with downsample sequences for factors up to nmax.
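To illustrate what a downsample sequence is, the sketch below splits a total decimation factor into a product of small integer factors. The set of allowed factors and the search order are assumptions made for this example; they are not taken from Pyrocko, and the table built by mk_decitab() may contain different sequences.

```python
def decimation_sequence(n, factors=(5, 4, 3, 2)):
    """Split n into a product of allowed small factors, or return None."""
    if n == 1:
        return ()
    for f in factors:
        if n % f == 0:
            rest = decimation_sequence(n // f, factors)
            if rest is not None:
                return (f,) + rest
    # n has a prime factor that is not covered by the allowed factors.
    return None


print(decimation_sequence(100))
print(decimation_sequence(7))
```

Applying decimate() once per factor in such a sequence gives the overall downsampling factor.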
- working_system_time_range(year_min_lim=None, year_max_lim=None)[source]¶
Check time range supported by the system’s time conversion functions.
Returns system time stamps of start of year of first/last fully supported year span. If this is before 1900 or after 2100, return first/last century which is fully supported.
- Returns:
(tmin, tmax, year_min, year_max)
- get_working_system_time_range()[source]¶
Caching variant of
working_system_time_range()
.
- julian_day_of_year(timestamp)[source]¶
Get the day number after the 1st of January of year in timestamp.
- Returns:
day number as int
- hour_start(timestamp)[source]¶
Get beginning of hour for any point in time.
- Parameters:
timestamp – time instant as system timestamp (in seconds)
- Returns:
instant of hour start as system timestamp
- day_start(timestamp)[source]¶
Get beginning of day for any point in time.
- Parameters:
timestamp – time instant as system timestamp (in seconds)
- Returns:
instant of day start as system timestamp
- month_start(timestamp)[source]¶
Get beginning of month for any point in time.
- Parameters:
timestamp – time instant as system timestamp (in seconds)
- Returns:
instant of month start as system timestamp
- year_start(timestamp)[source]¶
Get beginning of year for any point in time.
- Parameters:
timestamp – time instant as system timestamp (in seconds)
- Returns:
instant of year start as system timestamp
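The behaviour of these helpers can be mimicked with the standard library. The following sketch works in UTC on plain epoch seconds and is not Pyrocko's implementation; the function names are hypothetical.

```python
import calendar
import time


def day_start_sketch(timestamp):
    """Epoch time of 00:00:00 UTC on the day containing timestamp."""
    year, month, day = time.gmtime(timestamp)[:3]
    return calendar.timegm((year, month, day, 0, 0, 0))


def month_start_sketch(timestamp):
    """Epoch time of 00:00:00 UTC on the first day of the month."""
    year, month = time.gmtime(timestamp)[:2]
    return calendar.timegm((year, month, 1, 0, 0, 0))


t = 1598523720  # 2020-08-27 10:22:00 UTC
print(day_start_sketch(t), month_start_sketch(t))
```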
- iter_days(tmin, tmax)[source]¶
Yields begin and end of days until given time span is covered.
- Parameters:
tmin,tmax – input time span
- Yields:
tuples with (begin, end) of days as system timestamps
- iter_months(tmin, tmax)[source]¶
Yields begin and end of months until given time span is covered.
- Parameters:
tmin,tmax – input time span
- Yields:
tuples with (begin, end) of months as system timestamps
- iter_years(tmin, tmax)[source]¶
Yields begin and end of years until given time span is covered.
- Parameters:
tmin,tmax – input time span
- Yields:
tuples with (begin, end) of years as system timestamps
- decitab(n)[source]¶
Get integer decimation sequence for given downsampling factor.
- Parameters:
n – target decimation factor
- Returns:
tuple with downsampling sequence
- ctimegm(s, format='%Y-%m-%d %H:%M:%S')[source]¶
Convert string representing UTC time to system time.
- Parameters:
s – string to be interpreted
format – format string passed to
time.strptime()
- Returns:
system time stamp
Interprets string with format '%Y-%m-%d %H:%M:%S', using strptime.
Note
This function is to be replaced by str_to_time().
- gmctime(t, format='%Y-%m-%d %H:%M:%S')[source]¶
Get string representation from system time, UTC.
Produces string with format '%Y-%m-%d %H:%M:%S', using strftime.
Note
This function is to be replaced by time_to_str().
- gmctime_v(t, format='%a, %d %b %Y %H:%M:%S')[source]¶
Get string representation from system time, UTC. Same as gmctime() but with a more verbose default format.
Note
This function is to be replaced by time_to_str().
- gmctime_fn(t, format='%Y-%m-%d_%H-%M-%S')[source]¶
Get string representation from system time, UTC. Same as gmctime() but with a default usable in filenames.
Note
This function is to be replaced by time_to_str().
- exception FractionalSecondsMissing[source]¶
Bases:
TimeStrError
Exception raised by
str_to_time()
when the given string lacks fractional seconds.
- exception FractionalSecondsWrongNumberOfDigits[source]¶
Bases:
TimeStrError
Exception raised by
str_to_time()
when the given string has an incorrect number of digits in the fractional seconds part.
- str_to_time(s, format='%Y-%m-%d %H:%M:%S.OPTFRAC')[source]¶
Convert string representing UTC time to floating point system time.
- Parameters:
s – string representing UTC time
format – time string format
- Returns:
system time stamp as floating point value
Uses the semantics of time.strptime() but allows for fractional seconds. If the format ends with '.FRAC', anything after a dot is interpreted as fractional seconds. If the format ends with '.OPTFRAC', the fractional part, including the dot, is made optional. The latter has the consequence that the time strings and the format may not contain any other dots. If the format ends with '.xFRAC' where x is 1, 2, or 3, it is ensured that exactly that number of digits are present in the fractional seconds.
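The optional-fraction behaviour can be sketched with the standard library. This is a simplified stand-in for the '.OPTFRAC' case, not Pyrocko's parser; parse_utc is a hypothetical name, and the result is a plain 64-bit float regardless of any time handling mode.

```python
import calendar
import time


def parse_utc(s, fmt='%Y-%m-%d %H:%M:%S'):
    """Parse a UTC time string with an optional fractional seconds part."""
    # Everything after the first dot is taken as fractional seconds,
    # so neither the string nor the format may contain other dots.
    head, dot, frac = s.partition('.')
    t = calendar.timegm(time.strptime(head, fmt))
    return t + (float('0.' + frac) if dot else 0.0)


print(parse_utc('2020-08-27 10:22:00'))
print(parse_utc('2020-08-27 10:22:00.5'))
```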
- stt(s, format='%Y-%m-%d %H:%M:%S.OPTFRAC')¶
Convert string representing UTC time to floating point system time.
- Parameters:
s – string representing UTC time
format – time string format
- Returns:
system time stamp as floating point value
Uses the semantics of time.strptime() but allows for fractional seconds. If the format ends with '.FRAC', anything after a dot is interpreted as fractional seconds. If the format ends with '.OPTFRAC', the fractional part, including the dot, is made optional. The latter has the consequence that the time strings and the format may not contain any other dots. If the format ends with '.xFRAC' where x is 1, 2, or 3, it is ensured that exactly that number of digits are present in the fractional seconds.
- str_to_time_fillup(s)[source]¶
Default str_to_time() with filling in of missing values.
Allows e.g. ‘2010-01-01 00:00:00’ as ‘2010-01-01 00:00’, ‘2010-01-01 00’, …, or ‘2010’.
- time_to_str(t, format='%Y-%m-%d %H:%M:%S.3FRAC')[source]¶
Get string representation for floating point system time.
- Parameters:
t – floating point system time
format – time string format
- Returns:
string representing UTC time
Uses the semantics of time.strftime() but additionally allows for fractional seconds. If format contains '.xFRAC', where x is a digit between 1 and 9, this is replaced with the fractional part of t with x digits precision.
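A rough standard-library equivalent of the '.xFRAC' formatting is sketched below. It is not Pyrocko's implementation; format_utc and its digits parameter are hypothetical, and rounding is applied before splitting so that a fraction close to one second carries over into the seconds.

```python
import time


def format_utc(t, digits=3):
    """Format epoch time t as a UTC string with fractional seconds."""
    # Round to the requested precision first, then split off the fraction.
    scaled = round(t * 10 ** digits)
    seconds, frac = divmod(scaled, 10 ** digits)
    text = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(seconds))
    return text + '.%0*d' % (digits, frac)


print(format_utc(1598523720.5))
print(format_utc(1598523719.9996))
```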
- tts(t, format='%Y-%m-%d %H:%M:%S.3FRAC')¶
Get string representation for floating point system time.
- Parameters:
t – floating point system time
format – time string format
- Returns:
string representing UTC time
Uses the semantics of time.strftime() but additionally allows for fractional seconds. If format contains '.xFRAC', where x is a digit between 1 and 9, this is replaced with the fractional part of t with x digits precision.
- ensuredirs(dst)[source]¶
Create all intermediate path components for a target path.
- Parameters:
dst – target path
The leaf part of the target path is not created (use ensuredir() if the target path is a directory to be created).
- ensuredir(dst)[source]¶
Create directory and all intermediate path components to it as needed.
- Parameters:
dst – directory name
Nothing is done if the given target already exists.
- reuse(x)[source]¶
Get unique instance of an object.
- Parameters:
x – hashable object
- Returns:
reference to x or an equivalent object
Cache object x in a global dict for reuse, or if x already is in that dict, return a reference to it.
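The caching idea can be expressed in a few lines. This sketch only mirrors the description above and is not Pyrocko's code; the names _cache and reuse_sketch are made up for the example.

```python
_cache = {}


def reuse_sketch(x):
    """Return the first equal object seen so far, storing x if it is new."""
    return _cache.setdefault(x, x)


a = tuple(['GR', 'HAM3'])
b = tuple(['GR', 'HAM3'])  # equal to a, but a separate object

first = reuse_sketch(a)
second = reuse_sketch(b)
print(first is a, second is a)
```

Holding a single shared instance of frequently repeated values, such as station codes, can reduce memory use.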
- class Anon(**dict)[source]¶
Bases:
object
Dict-to-object utility.
Any given arguments are stored as attributes.
Example:
a = Anon(x=1, y=2)
print(a.x, a.y)
- iter_select_files(paths, include=None, exclude=None, selector=None, show_progress=True, pass_through=None)[source]¶
Recursively select files (generator variant).
See
select_files()
.
- select_files(paths, include=None, exclude=None, selector=None, show_progress=True, regex=None)[source]¶
Recursively select files.
- Parameters:
paths – entry path names
include – pattern for conditional inclusion
exclude – pattern for conditional exclusion
selector – callback for conditional inclusion
show_progress – if True, indicate start and stop of processing
regex – alias for
include
(backwards compatibility)
- Returns:
list of path names
Recursively finds all files under given entry points paths. If parameter include is a regular expression, only files with matching path names are included. If additionally parameter selector is given a callback function, only files for which the callback returns True are included. The callback is called with a single argument: an object having as attributes any named groups given in include.
Examples
To find all files ending in '.mseed' or '.msd':
select_files(paths, include=r'\.(mseed|msd)$')
To find all files ending with '$Year.$DayOfYear', having set 2009 for the year:
select_files(paths, include=r'(?P<year>\d\d\d\d)\.(?P<doy>\d\d\d)$', selector=(lambda x: int(x.year) == 2009))
- base36encode(number, alphabet='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')[source]¶
Convert positive integer to a base36 string.
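For illustration, base36 encoding can be done by repeated division, and Python's built-in int() can decode the result. The function below is a sketch with a hypothetical name, not Pyrocko's implementation.

```python
def to_base36(number, alphabet='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """Encode a non-negative integer as a base36 string."""
    if number == 0:
        return alphabet[0]
    digits = []
    while number:
        number, remainder = divmod(number, 36)
        digits.append(alphabet[remainder])
    # Digits were collected least significant first.
    return ''.join(reversed(digits))


encoded = to_base36(123456)
print(encoded, int(encoded, 36))
```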
- exception UnpackError[source]¶
Bases:
Exception
Exception raised when
unpack_fixed()
encounters an error.
- unpack_fixed(format, line, *callargs)[source]¶
Unpack fixed format string, as produced by many fortran codes.
- Parameters:
format – format specification
line – string to be processed
callargs – callbacks for callback fields in the format
The format is described by a string of comma-separated fields. Each field is defined by a character for the field type followed by the field width. A question mark may be appended to the field description to allow the argument to be optional (the data string is then allowed to be filled with blanks and None is returned in this case).
The following field types are available:
Type – Description
A – string (full field width is extracted)
a – string (whitespace at the beginning and the end is removed)
i – integer value
f – floating point value
@ – special type, a callback must be given for the conversion
x – special field type to skip parts of the string
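The field mechanism can be sketched in a much reduced form. The helper below is hypothetical and not Pyrocko's implementation: it handles only the A, a, i, f and x field types, and it supports neither optional fields (question mark) nor callback fields (@).

```python
def unpack_simple(format, line):
    """Unpack a fixed-width line according to a reduced format string."""
    converters = {'A': str, 'a': str.strip, 'i': int, 'f': float}
    values = []
    pos = 0
    for field in format.split(','):
        kind, width = field[0], int(field[1:])
        chunk = line[pos:pos + width]
        pos += width
        if kind != 'x':  # 'x' fields are skipped
            values.append(converters[kind](chunk))
    return values


print(unpack_simple('a4,x1,i3,f6', 'HAM3  42  1.50'))
```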
- match_nslc(patterns, nslc)[source]¶
Match network-station-location-channel code against pattern or list of patterns.
- Parameters:
patterns – pattern or list of patterns
nslc – tuple with (network, station, location, channel) as strings
- Returns:
True if the pattern matches or if any of the given patterns match; False otherwise.
The patterns may contain shell-style wildcards: *, ?, [seq], [!seq].
Example:
match_nslc('*.HAM3.*.BH?', ('GR', 'HAM3', '', 'BHZ')) # -> True
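Shell-style matching of this kind is available in the standard library's fnmatch module. The sketch below assumes that the four codes are joined with dots before matching, as the example pattern suggests; match_codes is a hypothetical name and not part of Pyrocko.

```python
from fnmatch import fnmatchcase


def match_codes(patterns, nslc):
    """Match a (network, station, location, channel) tuple against patterns."""
    if isinstance(patterns, str):
        patterns = [patterns]
    joined = '.'.join(nslc)
    return any(fnmatchcase(joined, pattern) for pattern in patterns)


print(match_codes('*.HAM3.*.BH?', ('GR', 'HAM3', '', 'BHZ')))
print(match_codes(['*.HAM3.*.LH?', 'XX.*'], ('GR', 'HAM3', '', 'BHZ')))
```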
- match_nslcs(patterns, nslcs)[source]¶
Get network-station-location-channel codes that match given pattern or any of several given patterns.
- Parameters:
patterns – pattern or list of patterns
nslcs – list of (network, station, location, channel) tuples
See also
match_nslc()
- exception SoleError[source]¶
Bases:
Exception
Exception raised by objects of type Sole, when a concurrent instance is running.
- class Sole(pid_path)[source]¶
Bases:
object
Use POSIX advisory file locking to ensure that only a single instance of a program is running.
- Parameters:
pid_path – path to lockfile to be used
Usage:
import logging
import os
import sys
from pyrocko.util import Sole, SoleError, setup_logging

logger = logging.getLogger('my_program')
setup_logging('my_program')
pid_path = os.path.join(os.environ['HOME'], '.my_program_lock')
try:
    sole = Sole(pid_path)
except SoleError as e:
    logger.critical(str(e))
    sys.exit(1)
- class TableWriter(f)[source]¶
Bases:
object
Write table of space separated values to a file.
- Parameters:
f – file like object
Strings containing spaces are quoted on output.
- writerow(row, minfieldwidths=None)[source]¶
Write one row of values to underlying file.
- Parameters:
row – iterable of values
minfieldwidths – minimum field widths for the values
Each value in row is converted to a string and optionally padded with blanks. The resulting strings are output separated with blanks. If any values given are strings and if they contain whitespace, they are quoted with single quotes, and any internal single quotes are backslash-escaped.
- class TableReader(f)[source]¶
Bases:
object
Read table of space separated values from a file.
- Parameters:
f – file-like object
This uses Python’s shlex module to tokenize lines. It should deal correctly with quoted strings.
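The tokenization step can be tried directly with shlex. This only shows how shlex splits a line containing a quoted value; how TableReader configures shlex and converts the resulting fields is not shown here.

```python
import shlex

line = "STA1 'Black Forest' 48.3"

# Quoted values survive as single tokens, with the quotes removed.
fields = shlex.split(line)
print(fields)
```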
- gform(number, significant_digits=3)[source]¶
Pretty print floating point numbers.
Align floating point numbers at the decimal dot.
|  -d.dde+xxx|
|  -d.dde+xx |
|-ddd.       |
| -dd.d      |
|  -d.dd     |
|  -0.ddd    |
|  -0.0ddd   |
|  -0.00ddd  |
|  -d.dde-xx |
|  -d.dde-xxx|
|         nan|
The formatted string has length
significant_digits * 2 + 6
.
- read_leap_seconds(tzfile='/usr/share/zoneinfo/right/UTC')[source]¶
Extract leap second information from tzdata.
Based on example at http://stackoverflow.com/questions/19332902/extract-historic-leap-seconds-from-tzdata
See also ‘man 5 tzfile’.
- consistency_check(list_of_tuples, message='values differ:')[source]¶
Check for inconsistencies.
Given a list of tuples, check that all tuple elements except for first one match. E.g.
[('STA.N', 55.3, 103.2), ('STA.E', 55.3, 103.2)]
would be valid because the coordinates at the two channels are the same.
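The check itself amounts to comparing everything but the first element of each tuple. The sketch below returns a boolean, which is an assumption made for illustration; how consistency_check() reports a mismatch is not described here.

```python
def is_consistent(list_of_tuples):
    """True if all tuples agree in every element except the first."""
    reference = list_of_tuples[0][1:]
    return all(t[1:] == reference for t in list_of_tuples)


print(is_consistent([('STA.N', 55.3, 103.2), ('STA.E', 55.3, 103.2)]))
print(is_consistent([('STA.N', 55.3, 103.2), ('STA.E', 55.4, 103.2)]))
```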
- unescape_s(s)[source]¶
Unescape backslash-escaped single-quotes and backslashes.
Example: Jack\'s => Jack's
- escape_d(s)[source]¶
Backslash-escape double-quotes and backslashes.
Example: "Hello \O/" => \"Hello \\O/\"
- unescape_d(s)[source]¶
Unescape backslash-escaped double-quotes and backslashes.
Example: \"Hello \\O/\" => "Hello \O/"
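The escaping and unescaping rules for double quotes can be sketched as follows. These are illustrative stand-ins with hypothetical names, not Pyrocko's functions. Backslashes are escaped first so that the backslashes added for quotes are not doubled, and unescaping is done in a single pass for the same reason.

```python
import re


def escape_double(s):
    """Backslash-escape backslashes and double quotes."""
    return s.replace('\\', '\\\\').replace('"', '\\"')


def unescape_double(s):
    """Undo escape_double in one pass over the string."""
    return re.sub(r'\\(["\\])', r'\1', s)


original = '"Hello \\O/"'
escaped = escape_double(original)
print(escaped)
print(unescape_double(escaped))
```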
- qjoin_s(it, sep=None)[source]¶
Join sequence of strings into a line, single-quoting non-trivial strings.
Example: ["55", "Sparrow's Island"] => "55 'Sparrow\\'s Island'"
- qjoin_d(it, sep=None)[source]¶
Join sequence of strings into a line, double-quoting non-trivial strings.
Example: ['55', 'Pete "The Robot" Smith'] => '55 "Pete \\"The Robot\\" Smith"'