Background
It is often desirable to run an Anchor task on a subset of files (either randomly chosen, or the first n inputs in order).
This can be:
- temporarily, testing a command before running it on the entire set of inputs.
- permanently, extracting a subset into another directory.
- permanently with anonymization - like above, but the filenames lose identifying information.
Anchor provides a series of input command options that can be combined to achieve this for many tasks.
It also provides the anonymize predefined task.
Subsetting inputs
Taking the initial n inputs in order
Use the -il command option to restrict the total number of inputs.
For example, to form a montage of at most the first 7 files:
anchor -il 7 -t montage
Taking a percentage of all inputs
The -il command option also accepts a floating-point value in the range 0.0 < value < 1.0, indicating a proportion of the total number of inputs.
For example, to convert the first 20% of all inputs (at most):
anchor -il 0.2 -t convert
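The two forms of the -il argument can be sketched in plain shell. This is only an illustration of the behavior described above, not Anchor's implementation: limit_count is a hypothetical helper, and rounding a fractional result down is an assumption, since the rounding rule is not stated here.

```shell
# Hypothetical helper (not part of Anchor): resolve an -il style argument
# into a number of inputs, given the total number of inputs available.
limit_count() {
    total=$1
    arg=$2
    case $arg in
        *.*)
            # Fractional argument: treat it as a proportion of the total.
            # Assumption: round down; Anchor's actual rounding may differ.
            awk -v t="$total" -v f="$arg" 'BEGIN { printf "%d\n", int(t * f) }'
            ;;
        *)
            # Integer argument: an upper bound, capped at the total.
            if [ "$arg" -lt "$total" ]; then
                echo "$arg"
            else
                echo "$total"
            fi
            ;;
    esac
}

limit_count 50 7     # integer: at most 7 of 50 inputs
limit_count 5 7      # only 5 inputs exist, so all 5 are used
limit_count 50 0.2   # fraction: 20% of 50 inputs
```

The second call shows why the integer form is described as a maximum: when fewer inputs exist than requested, all of them are taken.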
A random subset
Use the -ir command option to take a random sample. It takes the same kind of argument as -il.
For example, to copy a random subset:
anchor -ir 20 -t copy # a random subset of at most 20 files
anchor -ir 0.5 -t copy # a random subset of 50% of the total number of inputs
Internally, -ir <arg> is equivalent to -il <arg> -is.
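The idea behind this equivalence can be sketched with standard tools, assuming that -is shuffles the order of the inputs before the limit is applied. The helper random_subset and the file names are hypothetical, and shuf is the GNU coreutils utility, so this sketch assumes a system where it is installed.

```shell
# Hypothetical helper (not part of Anchor): shuffle the input list, then keep
# the first n entries, mirroring the shuffle-then-limit idea.
random_subset() {
    n=$1
    shuf | head -n "$n"
}

# Ten stand-in input names, one per line.
inputs=$(printf 'image_%02d.png\n' 1 2 3 4 5 6 7 8 9 10)

# Three names drawn at random; the selection differs between runs.
printf '%s\n' "$inputs" | random_subset 3
```

Because the list is shuffled before it is truncated, every input has the same chance of being selected, whereas truncating alone always keeps the same leading inputs.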
Anonymizing
Outputting as a numeric sequence
The simplest anonymization combines the -il and -is input options (or equivalently -ir, see above) with the -on output option, which writes files as an incrementing numeric series.
anchor -ir 20 -on -t convert
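To illustrate why a numeric series anonymizes the outputs, the renaming can be sketched as a mapping from original names to sequence numbers. This is not Anchor's implementation: numeric_names is a hypothetical helper, the file names are invented, and the zero-padded width, the starting index of 0 and the retained extension are all assumptions about the naming scheme.

```shell
# Hypothetical helper (not part of Anchor): print an "old -> new" mapping that
# replaces each filename with an incrementing number, keeping the extension.
# Assumes every name read from standard input has an extension.
numeric_names() {
    i=0
    while IFS= read -r name; do
        ext=${name##*.}                          # text after the last dot
        printf '%s -> %03d.%s\n' "$name" "$i" "$ext"
        i=$((i + 1))
    done
}

printf '%s\n' patient_smith.tif patient_jones.tif | numeric_names
```

The new names carry no trace of the originals. When the inputs are also randomly sampled, as in the command above, the sequence number does not reveal a file's position in the original ordering either.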
The anonymize predefined task
Alternatively, the anonymize predefined task is similar to the copy task, but automatically anonymizes the names.
anchor -t anonymize # anonymize all the inputs
anchor -ir 0.3 -t anonymize # anonymize 30% of the inputs