Set up environment¶

Auptimizer needs to be initialized properly before use, either through python -m aup.setup for interactive setup or python -m aup.setup <config.ini> with the specified configuration file. We recommend users use the interactive setup when using auptimizer for the first time.

The configuration file is used to set up all environment-related information for Auptimizer, such as number of CPUs, GPUs, or remote servers. All your experiments and jobs will use the configuration.

Configuration Options¶

Two templates can be found at Examples/2dfunc_diff_res/*.ini. The detailed options are as follows.

Environment Template File (*.ini)¶

For a normal workstation, an example template is at Examples/2dfunc_diff_res/env_user_template.ini for direct use. The command below will create a user environment profile (at ~/.aup/env.ini) for quick-start:

python -m aup.setup ./Examples/2dfunc_diff_res/env_user_template.ini

or to create a local environment folder for a particular experiment (in working folder ./.aup/env.ini), use:

python -m aup.setup ./Examples/2dfunc_diff_res/env_local_template.ini

The content of the template is the following:

Name	Purpose	Default value
Auptimizer_PATH	configuration folder for Auptimizer, containing this `env.ini` file and database file	`.aup`
SQL_ENGINE	database engine, currently only supports sqlite	`sqlite`
TMP_FOLDER	temp folder for logging / scratch	`/tmp/aup` or `./aup_tmp`

Available resources for model training are specified by additional arguments, or via interactive questions.

Argument	Default	Details
–aws	none	a file containing `user@instance-id`, or a comma-separated list
–cpu	4	Number of processors to use on a single machine.
–gpu	none	a file containing the GPU id number on a machine, or a comma-separated list.
–node	none	a file containing `user@IP` for training, or a comma-separated list

AWS Mapping Configuration (–aws)¶

Auptimizer has the capability to start and stop AWS instances and assign jobs dynamically to the instances depending on availability. To allow fragmented resource utilization within the AWS instance, edit the resources available to Auptimizer when installing it on the remote instance.

1. To use the Auptimizer’s AWS option, install and configure boto3 and ensure your AWS instances can be accessed with boto3. Then proceed to install configure AWS CLI.

2. To run jobs on multiple AWS instances, we need to specify an AWS mapping file with –aws <aws_mapping_file>. Similar to the above-mentioned node case, <aws_mapping_file> is a text file containing the username, AWS instance id and the SSH key file (.pem file).

The format of the file must be one of the following:

<username>@<instance_id> <ssh_key_file>

<username>@<instance_id>

<username>@<instance_id>:port

<username>@<instance_id>:port <ssh_key_file>

with each instance on a separate line, or if you setup the configuration interactively, you can provide them as comma-separated values. The username is the one used to log into the instance. The instance_id can be the EC2 instance ID, IP, or public DNS.

Refer to Examples/2dfunc_diff_res/aws.txt for an example.

3. Either you need to install Auptimizer on every node, or you need to copy <repo>/src/aup.py to your remote working directory.

4. Finally, the environment of the instances might be sourced correspondingly for environment variables, e.g. PATH. For instance, if you want to activate virtualenv on the instance before job running, use prescript in experiment.json (See Additional runtime configuration for Node/AWS for more detail).

CPU Configuration (`--cpu`)¶

We can specify how many parallel jobs to be run on CPUs. The current implementation does NOT provide real isolation of hardware.

GPU Mapping Configuration (`--gpu`)¶

To run jobs on multiple GPUs, we need to specify a GPU mapping file with --gpu <gpu_mapping_file>. This file is a text file containing the IDs of GPU cards to use (in CUDA_VISIBLE_DEVICES). Specifically, each line contains a GPU id as an integer.

For single GPU without parallel execution, use Examples/2dfunc_diff_res/plainGPU.txt.
Assign multiple jobs to run on the same device by assigning multiple resource IDs to the same GPU id, i.e. 0,0). See Examples/2dfunc_diff_res/singleGPU.txt.
Assign multiple jobs on different GPUs on a local machine, i.e.:
```
0
1
```
See Examples/2dfunc_diff_res/twoGPUs.txt.

Node Mapping Configuration (`--node`)¶

Similar to GPU mapping, users can specify the computing nodes to be used with --node <node mapping file> during setup

A node configuration file contains each node per line (e.g., username@hostname.) An SSH connection without password authentication is required (use ssh-keygen to create keyless access).

The format of the file must be one of the followings:

<username>@<IP> <ssh_key_file>
<username>@<IP>
<username>@<IP>:port
<username>@<IP>:port <ssh_key_file>

with each instance on a separate line, or if you setup the configuration interactively, you can provide them as comma-separated values.

For remote execution, Auptimizer will not copy all job-related files to the remote machine. User should make sure the job script can run on the remote machine first.

Refer to Examples/2dfunc_diff_res/node.txt for an example.

Either you need to install Auptimizer on every node, or you need to copy <repo>/src/aup.py to your remote working directory.

Optional arguments¶

--overwrite - overwrite existing .aup folder. Otherwise, do nothing
--log - choose log level from [debug,info,warn,error]
--user - not used. It specifies the user ownership for experiments.

Examples¶

Please refer to Examples/2dfunc_diff_res/README.md for examples of how to use different resources with Auptimizer.

Database Setup¶

During the setup, Auptimizer creates a SQL database to track the jobs and experiments (currently only sqlite is supported). Typically, users do not need to manually access it. Here we provide a little more detail for users to retrieve additional records for their analyses.

The database schema is described below:

SQL Schema¶

Refresh tables¶

To (re)create the database, users just need to follow the command printed after python -m aup.setup:

python -m aup.setupdb .aup/env.ini

This will parse the .aup/env.ini file to create the new database.

Refresh tables with additional modification¶

The following argument can be customized by changing it in the headings or from the command line to overwrite the aup.setup configuration:

Name	Purpose	Default value
–cpu	Number of CPUs to be used.	4
–name	Name for resources	“localhost” , Currently not in use.
–user	username for experiment tracking and control	Currently not in use.

For example, if user wants to set maximum 2 jobs to be run on CPU in parallel, under name test:

python -m aup.setupdb <path to env.ini> --cpus 2 --user test

Then you can see the allocated resource table as:

echo "select * from resource;" | sqlite3 .aup/sqlite3.db

test|gpu|free
test|cpu|free
test|cpu|free
test|passive|free

Reset¶

To reset the history of Auptimizer experiments, there are two levels.

Full reset¶

Removing .aup folder will completely remove all history saved by Auptimizer as well as any configurations.

Using python -m aup.setup --overwrite <config.ini> will overwrite the existing folder and remove only the history.

Database reset¶

Currently, Auptimizer experiments and jobs history is saved in .aup/sqlite.db database. There are two levels to reset the database itself.

Reset all¶

We can also just refresh the database for the history by resetting the database file.

Use (locally in working folder):

python -m aup.setupdb .aup/env.ini

or (user account-wise):

python -m aup.setupdb ~/.aup/env.ini

Reset job status¶

Sometimes, when Auptimizer accidentally exits, the resources are not marked as free in the database and will prevent you from using them within Auptimizer. In such scenarios, you might want to reset the status of resources by:

python -m aup.setupdb.reset env.ini