
Background

It is often desirable to run an Anchor task on a subset of files (either randomly chosen, or the first n inputs in order).

This can be done:

  • temporarily, to test a command before running it on the entire set of inputs.
  • permanently, to extract a subset into another directory.
  • permanently with anonymization, as above, but with identifying information removed from the filenames.

Anchor provides a series of input command options that can be combined to achieve this for many tasks.

It also provides the anonymize predefined task.

Subsetting inputs

Taking the initial n inputs in order

Use the -il command option to restrict the total number of inputs.

e.g. to form a montage of at most the first 7 files:

anchor -il 7 -t montage

Taking a percentage of all inputs

The -il command option also accepts a floating-point value in the range 0.0 < value < 1.0, which is interpreted as a fraction of the total number of inputs (e.g. 0.2 means 20%).

e.g. to convert at most the first 20% of all inputs:

anchor -il 0.2 -t convert
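The two forms of the -il argument can be pictured with a short Python sketch. This is only an illustration of the behaviour described above, not Anchor's implementation: the helper name resolve_limit is hypothetical, and rounding a fractional result down is an assumption, since the rounding rule is not stated here.

```python
import math


def resolve_limit(arg: str, total: int) -> int:
    """Turn a limit argument into a number of inputs (hypothetical helper)."""
    if "." in arg:
        # Fractional form: a proportion of the total number of inputs.
        fraction = float(arg)
        if not 0.0 < fraction < 1.0:
            raise ValueError("a fractional limit must lie strictly between 0.0 and 1.0")
        # Assumption: round down. The actual rounding rule is not documented here.
        count = math.floor(total * fraction)
    else:
        # Integer form: an absolute number of inputs.
        count = int(arg)
        if count < 1:
            raise ValueError("an integer limit must be at least 1")
    # The limit is a maximum: fewer inputs are used if fewer exist.
    return min(count, total)


print(resolve_limit("7", 100))   # integer form
print(resolve_limit("0.2", 10))  # fractional form
```

Under these assumptions, a limit of 7 applied to only 3 inputs yields 3, which matches the "at most" wording above.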

A random subset

Use the -ir command option to take a random sample. It accepts the same kind of argument as -il.

e.g. to copy a random subset:

anchor -ir 20 -t copy		# a random subset of 20 files (maximally)
anchor -ir 0.5 -t copy		# a random subset of 50% of the total number of inputs

Internally, -ir <arg> is equivalent to -il <arg> -is.
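The equivalence above can be read as "shuffle, then limit". The Python sketch below illustrates that reading. It assumes that -is shuffles the order of the inputs, which this section does not state explicitly, and the function name random_subset and its seed parameter are hypothetical.

```python
import random


def random_subset(inputs, count, seed=None):
    """Shuffle the inputs, then keep at most `count` of them (illustrative only)."""
    rng = random.Random(seed)  # a seed makes the sample reproducible
    shuffled = list(inputs)    # copy, so the caller's sequence is left untouched
    rng.shuffle(shuffled)      # assumed effect of -is
    return shuffled[:count]    # effect of -il <count>


files = [f"image_{i}.png" for i in range(100)]
subset = random_subset(files, 20, seed=0)
print(len(subset))  # 20
```

Because the limit is applied after the shuffle, every input has the same chance of being selected, and asking for more inputs than exist simply returns all of them in random order.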

Anonymizing

Outputting as a numeric sequence

The easiest anonymization combines the -il and -is command input options (see above), or their shorthand -ir, with the -on command output option, which writes files as an incrementing numeric series.

anchor -ir 20 -on -t convert
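As a rough picture of what an incrementing numeric series means for filenames, consider the Python sketch below. It is not Anchor's naming code: the helper numeric_names is hypothetical, and starting at 0, zero-padding to a common width, and keeping the original extension are all assumptions made for illustration.

```python
from pathlib import PurePath


def numeric_names(paths):
    """Map each input path to an index-based name (illustrative only)."""
    # Assumption: pad with zeros so that all names have the same width.
    width = len(str(len(paths) - 1)) if paths else 1
    return [
        # Assumption: numbering starts at 0 and the extension is kept.
        f"{index:0{width}d}{PurePath(path).suffix}"
        for index, path in enumerate(paths)
    ]


print(numeric_names(["patients/smith_john.png", "patients/doe_jane.jpg"]))
```

Whatever the exact scheme, the point is that the output name depends only on an input's position in the (shuffled) sequence, so the original name cannot be recovered from it.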

The anonymize predefined task

Alternatively, the anonymize predefined task is similar to the copy task, but automatically anonymizes the names.

anchor -t anonymize			# anonymize all the inputs
anchor -ir 0.3 -t anonymize		# anonymize 30% of the inputs