Squirrel tutorial¶
TODO: In this tutorial we will download data for the … Event…
Downloading data¶
Squirrel offers transparent download of seismic waveforms and station metadata from FDSN web services. With an appropriate dataset configuration this can happen just
We first create a local Squirrel environment, so that all the downloaded files as well as the database are stored in the current directory under .squirrel/. This will make it easier to clean up when we are done (rm -rf .squirrel/). If we omit this step, the user’s global Squirrel environment (~/.pyrocko/cache/squirrel/) is used.
Create local environment (optional):
TODO: picture
$ squirrel init
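The choice between the local and the global environment described above can be pictured with a small sketch. It is an illustration only, not Squirrel's actual lookup code: the function name resolve_environment is hypothetical, and it assumes that only the current directory itself is checked for a .squirrel/ directory.

```python
from pathlib import Path


def resolve_environment(cwd: Path, home: Path) -> Path:
    """Pick the environment directory for a working directory.

    Assumption (not confirmed by the tutorial): only `cwd` itself is
    checked for a `.squirrel/` directory, parent directories are not.
    """
    local_env = cwd / ".squirrel"
    if local_env.is_dir():
        # A local environment exists, e.g. created by `squirrel init`.
        return local_env
    # Otherwise fall back to the user's global environment.
    return home / ".pyrocko" / "cache" / "squirrel"


if __name__ == "__main__":
    print(resolve_environment(Path.cwd(), Path.home()))
```

Run from a directory where squirrel init has been executed, the sketch prints the local .squirrel/ path; elsewhere it prints the global cache path.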
To use a remote data source we can create a dataset description file and pass this to the --dataset option of the various squirrel subcommands.
Examples of such dataset description files are provided by the squirrel template command. By chance there already is an example for accessing all LH channels from BGR’s FDSN web service! We can save the example dataset description file with
$ squirrel template bgr-gr-lh.dataset -w
squirrel:psq.cli.template - INFO - File written: bgr-gr-lh.dataset.yaml
The dataset description is a nicely commented YAML file and we could modify it to our liking.
--- !squirrel.Dataset
# All file paths given below are treated relative to the location of this
# configuration file. Here we may give a common prefix. For example, if the
# configuration file is in the sub-directory 'PROJECT/config/', set it to '..'
# so that all paths are relative to 'PROJECT/'.
path_prefix: '.'
# Data sources to be added (LocalData, FDSNSource, CatalogSource, ...)
sources:
- !squirrel.FDSNSource
  # URL or alias of FDSN site.
  site: bgr
  # Uncomment to let metadata expire in 10 days:
  #expires: 10d
  # Waveforms can be optionally shared with other FDSN client configurations,
  # so that data is not downloaded multiple times. The downside may be that in
  # some cases more data than expected is available (if data was previously
  # downloaded for a different application).
  #shared_waveforms: true
  # FDSN query arguments to make metadata queries.
  # See http://www.fdsn.org/webservices/fdsnws-station-1.1.pdf
  # Time span arguments should not be added here, because they are handled
  # automatically by Squirrel.
  query_args:
    network: 'GR'
    channel: 'LH?'
Expert users can get a non-commented version of the file by adding --format brief to the squirrel template command.
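The path_prefix comment in the dataset file describes how relative paths are resolved against the location of the configuration file. The sketch below mimics that rule with plain path arithmetic. It is not taken from Squirrel's source; the function name and the file name local.dataset.yaml under PROJECT/config/ are used for illustration only.

```python
import posixpath


def resolve_dataset_path(config_file: str, path_prefix: str, path: str) -> str:
    """Resolve `path` relative to the config file location plus `path_prefix`.

    Illustrative only: this mirrors the rule stated in the template comments
    and is not Squirrel's own implementation.
    """
    config_dir = posixpath.dirname(config_file)
    return posixpath.normpath(posixpath.join(config_dir, path_prefix, path))


if __name__ == "__main__":
    # Config in 'PROJECT/config/' with path_prefix '..': paths are
    # relative to 'PROJECT/'.
    print(resolve_dataset_path("PROJECT/config/local.dataset.yaml", "..", "data/sds"))
    # Config in the current directory with path_prefix '.': paths are
    # relative to the current directory.
    print(resolve_dataset_path("bgr-gr-lh.dataset.yaml", ".", "data/sds"))
```

With these inputs the first call yields PROJECT/data/sds and the second yields data/sds.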
Now we tell squirrel to update the meta-information for the time interval of interest. This is done with the squirrel update command. Channel information intersecting with the given time interval will be downloaded.
TODO: picture
$ squirrel update --dataset bgr-gr-lh.dataset.yaml --tmin 2021-07-28 --tmax 2021-08-01
[...]
squirrel update:psq.client.fdsn - INFO - FDSN "bgr" metadata: querying...
squirrel update:psq.client.fdsn - INFO - FDSN "bgr" metadata: new (expires: never)
[...]
squirrel update:psq.cli.update - INFO - Squirrel stats:
Number of files: 2
Total size of known files: 87 kB
Number of index nuts: 160
Available content kinds:
channel: 120 1991-09-01 00:00:00.000 - <none>
station: 40 <none> - <none>
Available codes:
GR.AHRW..LHE GR.AHRW..LHN GR.AHRW..LHZ GR.AHRW.* GR.ASSE..LHE GR.ASSE..LHN
GR.ASSE..LHZ GR.ASSE.* GR.BFO..LHE GR.BFO..LHN
[140 more]
GR.UBR..LHZ GR.UBR.* GR.WET..LHE GR.WET..LHN GR.WET..LHZ GR.WET.*
GR.ZARR..LHE GR.ZARR..LHN GR.ZARR..LHZ GR.ZARR.*
Sources:
client:fdsn:b3ad21f2a866c178889cfdf4f493eba588a59543
Operators: <none>
After fetching the meta information from the FDSN web service, a brief overview of the contents currently known to Squirrel is printed.
If we run the update command a second time, Squirrel informs us that cached metadata has been used:
$ squirrel update --dataset bgr-gr-lh.dataset.yaml --tmin 2021-07-28 --tmax 2021-08-01
[...]
squirrel update:psq.client.fdsn - INFO - FDSN "bgr" metadata: using cached (expires: never)
[...]
It is possible to set an expiration time for the metadata in the dataset configuration, using the expires entry that is commented out in the template above.
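The caching behaviour seen in the log messages ("new (expires: never)" versus "using cached") can be sketched as a simple age check. This is a guess at the logic, not Squirrel's implementation: the function names are hypothetical, and only the day suffix seen in the template (10d) is handled, since other units are not shown in this tutorial.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_expires(value: Optional[str]) -> Optional[timedelta]:
    """Parse a duration such as '10d'. None stands for "never expires".

    Assumption: only the 'd' (days) suffix from the template is supported.
    """
    if value is None:
        return None
    if not value.endswith("d"):
        raise ValueError(f"unsupported duration in this sketch: {value!r}")
    return timedelta(days=float(value[:-1]))


def is_expired(fetched_at: datetime, expires: Optional[str], now: datetime) -> bool:
    """Return True if cached metadata fetched at `fetched_at` should be re-queried."""
    lifetime = parse_expires(expires)
    if lifetime is None:
        return False  # corresponds to "expires: never"
    return now - fetched_at >= lifetime


if __name__ == "__main__":
    fetched = datetime(2021, 7, 28, tzinfo=timezone.utc)
    later = fetched + timedelta(days=11)
    print(is_expired(fetched, None, later))
    print(is_expired(fetched, "10d", later))
```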
If we later need the instrument response information of the seismic stations of the data selection, we can add the --responses option to the update command:
TODO: picture
$ squirrel update --responses --dataset bgr-gr-lh.dataset.yaml --tmin 2021-07-28 --tmax 2021-08-01
[...]
Available content kinds:
channel: 120 1991-09-01 00:00:00.000 - <none>
response: 150 1991-01-01 00:00:00.000 - <none>
station: 40 <none> - <none>
[...]
Now we also have response information which contains details about how the seismometers convert physical ground motion into measurement records.
Next, we must give Squirrel permission to download data under certain constraints. Squirrel will only download waveform data when it has a so-called promise for a given time span and channel. These promises must be explicitly created with the --promises option of squirrel update. We are only interested in vertical component seismograms at this point, so we restrict promise creation to channels ending in ‘Z’.
TODO: picture (local channels with Z are marked blue)
$ squirrel update --promises --dataset bgr-gr-lh.dataset.yaml --tmin 2021-07-28 --tmax 2021-08-01 --codes '*.*.*.??Z'
[...]
Available content kinds:
channel: 120 1991-09-01 00:00:00.000 - <none>
station: 40 <none> - <none>
waveform_promise: 40 2021-07-28 00:00:00.000 - 2021-08-01 00:00:00.000
[...]
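To see which channels a pattern like '*.*.*.??Z' selects, the following sketch applies shell-style wildcard matching to a few of the codes listed earlier. It uses Python's fnmatch module as a stand-in; whether Squirrel matches codes in exactly this way is an assumption (for instance, fnmatch lets * match across dots).

```python
from fnmatch import fnmatchcase
from typing import List


def select_codes(codes: List[str], pattern: str) -> List[str]:
    """Return the codes matching a shell-style pattern (approximation only)."""
    return [code for code in codes if fnmatchcase(code, pattern)]


if __name__ == "__main__":
    # A few NET.STA.LOC.CHA codes taken from the listing printed by
    # `squirrel update`.
    codes = [
        "GR.AHRW..LHE", "GR.AHRW..LHN", "GR.AHRW..LHZ",
        "GR.ASSE..LHE", "GR.ASSE..LHN", "GR.ASSE..LHZ",
    ]
    for code in select_codes(codes, "*.*.*.??Z"):
        print(code)
```

Only one in three channels ends in Z, which is consistent with 40 waveform promises being created for 120 known channels.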
To actually download the waveforms, we can now use the squirrel summon command.
TODO: picture (waveforms are added).
$ squirrel summon --dataset bgr-gr-lh.dataset.yaml --tmin 2021-07-28 --tmax 2021-08-01
Finally we can have a look at the data.
$ squirrel snuffler --dataset bgr-gr-lh.dataset.yaml
TODO: The M8.2 Alaska earthquake is at TIME …
Waveforms are always downloaded in blocks of reasonable size, so the downloaded time frame may be slightly larger than the requested time span.
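Why block-wise downloading can widen the time frame is easiest to see with a little arithmetic. The sketch below snaps a requested span outward to multiples of a block length. The block length and the alignment to multiples of it are assumptions made for illustration; the tutorial does not state the values Squirrel uses internally.

```python
import math
from typing import Tuple


def snap_to_blocks(tmin: float, tmax: float, block_length: float) -> Tuple[float, float]:
    """Widen [tmin, tmax] so that both ends fall on block boundaries.

    `block_length` is a hypothetical parameter (in seconds).
    """
    block_tmin = math.floor(tmin / block_length) * block_length
    block_tmax = math.ceil(tmax / block_length) * block_length
    return block_tmin, block_tmax


if __name__ == "__main__":
    # A request from t=10 s to t=95 s with 30 s blocks covers blocks 0 to 3.
    print(snap_to_blocks(10.0, 95.0, 30.0))
```

A request that already starts and ends on block boundaries is returned unchanged, while any other request grows by less than one block at each end.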
Dataset conversion¶
So far the data has been downloaded into a special cache directory maintained by Squirrel. Using the data from there is useful if we will later add more waveforms. However, sometimes we want to create our own waveform archive in a portable form.
TODO: picture
To copy the data downloaded in the previous section into a handy directory structure, we can use the squirrel jackseis command. With its --out-sds-path option, a standard SDS data directory with day-files in MSEED format is created.
$ squirrel jackseis --dataset bgr-gr-lh.dataset.yaml --out-sds-path data/sds
$ tree data/ # Use `ls`, if `tree` is not installed.
data/
└── sds
    └── 2021
        └── GR
            ├── BFO
            │   └── LHZ.D
            │       ├── GR.BFO..LHZ.D.2021.208
            │       ├── GR.BFO..LHZ.D.2021.209
            │       ├── GR.BFO..LHZ.D.2021.210
            │       ├── GR.BFO..LHZ.D.2021.211
            │       ├── GR.BFO..LHZ.D.2021.212
            │       └── GR.BFO..LHZ.D.2021.213
            ├── ...
We will use this dataset as a “local dataset” in the following sections.
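The file names in the listing above follow the SDS layout YEAR/NET/STA/CHA.D/NET.STA.LOC.CHA.D.YEAR.DOY, where DOY is the day of the year. As a rough illustration of that naming scheme, the sketch below builds the path of a day-file from its codes and date. The function name is hypothetical, and the data type D is taken as fixed, as in the listing.

```python
from datetime import date


def sds_path(root: str, network: str, station: str, location: str,
             channel: str, day: date) -> str:
    """Build the path of an SDS day-file (data type 'D' assumed)."""
    year = day.year
    doy = day.timetuple().tm_yday  # day of year, 1-based
    filename = f"{network}.{station}.{location}.{channel}.D.{year}.{doy:03d}"
    return f"{root}/{year}/{network}/{station}/{channel}.D/{filename}"


if __name__ == "__main__":
    # 2021-07-28, the start of the requested time span, is day 209.
    print(sds_path("data/sds", "GR", "BFO", "", "LHZ", date(2021, 7, 28)))
```

The listing starts one day earlier, at day 208 (2021-07-27), which fits the earlier remark that the downloaded time frame may be slightly larger than the requested one.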
TODO: add metadata export TODO: picture
$ squirrel jackseis --dataset bgr-gr-lh.dataset.yaml --out-meta-path meta/stations.xml
Local datasets¶
To inspect some local data holdings, we can use the Snuffler application. Files and directories are added with the --add option:
$ squirrel snuffler --add data/sds meta/stations.xml
$ squirrel template local.dataset
file listing
explain path_prefix
$ squirrel snuffler --dataset config/local.dataset.yaml
Dataset inspection¶
$ squirrel scan --dataset config/local.dataset.yaml
$ squirrel coverage --dataset config/local.dataset.yaml
$ squirrel codes --dataset config/local.dataset.yaml
$ squirrel nuts --dataset config/local.dataset.yaml --codes '.BFO..*'
$ squirrel files --dataset config/local.dataset.yaml --codes '.BFO..*'