DB Methods

Database methods

class iceprod.server.dbmethods.CacheInfo

CacheInfo(hits, misses, maxsize, currsize)

Create new instance of CacheInfo(hits, misses, maxsize, currsize)

currsize

Alias for field number 3

hits

Alias for field number 0

maxsize

Alias for field number 2

misses

Alias for field number 1

iceprod.server.dbmethods.memcache(size=1024, ttl=None)

Caching decorator.

Instrumented like functools.lru_cache() with cache_clear() and cache_info().

Parameters:
  • size (int) – Max number of entries
  • ttl (int) – Time to live (default None = infinite)
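
As an illustration of the behaviour described above, here is a minimal sketch of a size-bounded cache decorator with an optional time to live. It is not the iceprod implementation: the name `memcache_sketch`, the least-recently-used eviction policy, and the assumption that `ttl` is measured in seconds are all hypothetical.

```python
import functools
import time
from collections import OrderedDict, namedtuple

# Same field order as the CacheInfo tuple documented above.
CacheInfo = namedtuple("CacheInfo", ["hits", "misses", "maxsize", "currsize"])


def memcache_sketch(size=1024, ttl=None):
    """Hypothetical stand-in: LRU cache with optional expiry (ttl in seconds)."""
    def decorator(func):
        cache = OrderedDict()  # key -> (expiry time or None, value)
        stats = {"hits": 0, "misses": 0}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            now = time.monotonic()
            if key in cache:
                expires, value = cache[key]
                if expires is None or now < expires:
                    cache.move_to_end(key)  # mark as most recently used
                    stats["hits"] += 1
                    return value
                del cache[key]  # entry outlived its ttl
            stats["misses"] += 1
            value = func(*args, **kwargs)
            cache[key] = (None if ttl is None else now + ttl, value)
            if len(cache) > size:
                cache.popitem(last=False)  # evict the least recently used entry
            return value

        def cache_info():
            return CacheInfo(stats["hits"], stats["misses"], size, len(cache))

        def cache_clear():
            cache.clear()
            stats["hits"] = stats["misses"] = 0

        wrapper.cache_info = cache_info
        wrapper.cache_clear = cache_clear
        return wrapper
    return decorator


@memcache_sketch(size=2)
def square(x):
    return x * x


square(3)  # miss: computed and stored
square(3)  # hit: served from the cache
info = square.cache_info()
```

Here `info` reports one hit, one miss, a maximum size of 2, and one stored entry; `square.cache_clear()` empties the cache and resets the counters.
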
iceprod.server.dbmethods.authorization(**kwargs)

Authorization decorator.

Must be used on a member function of dbmethods in order to access the DB for authorization information.

Non-decorator optional args:
  • passkey (str) – the passkey (for site, task, or user)
  • cookie_id (str) – the user_id supplied by a cookie
  • site_id (str) – the site id
Parameters:
  • site (bool) – Valid for site queries
  • user (str, list, callable) – The user id(s) to match against
  • role (str, list, callable) – The role id(s) to match against
  • expression (str) – The expression of site, user, and role
iceprod.server.dbmethods.filtered_input(input_data)

Filter input to SQL in cases where we can’t use bindings. Removes all " ' ; : ? characters, since those are not needed in proper names.
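
The character stripping described here can be sketched as follows. The function name `filtered_input_sketch` is hypothetical, and the sketch handles only strings; how the real function treats other input types is not stated in this reference.

```python
# Characters the description says are removed: " ' ; : ?
_STRIP_TABLE = str.maketrans("", "", "\"';:?")


def filtered_input_sketch(value):
    """Hypothetical stand-in: drop the listed characters from a string."""
    return value.translate(_STRIP_TABLE)


cleaned = filtered_input_sketch("task'; DROP TABLE task; --")
```

Stripping characters is a fallback for identifiers such as table or column names; wherever a value can be passed as a binding, bindings remain the safer choice.
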

iceprod.server.dbmethods.datetime2str(dt)

Convert a datetime object to an ISO 8601 string.

iceprod.server.dbmethods.nowstr()

Get an ISO 8601 string of the current time in UTC.

iceprod.server.dbmethods.str2datetime(st)

Convert an ISO 8601 string to a datetime object.
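
The three helpers above form a round trip between datetime objects and ISO 8601 strings. A minimal sketch using only the standard library is shown below; the `_sketch` names are hypothetical, and the exact string layout iceprod produces (for example, whether a UTC offset or fractional seconds are included) is an assumption.

```python
from datetime import datetime, timezone


def datetime2str_sketch(dt):
    """Format a datetime as an ISO 8601 string."""
    return dt.isoformat()


def nowstr_sketch():
    """ISO 8601 string for the current time in UTC."""
    return datetime2str_sketch(datetime.now(timezone.utc))


def str2datetime_sketch(st):
    """Parse an ISO 8601 string back into a datetime."""
    return datetime.fromisoformat(st)


stamp = datetime(2015, 3, 14, 9, 26, 53)
text = datetime2str_sketch(stamp)
round_trip = str2datetime_sketch(text)
```
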

iceprod.server.dbmethods.dbmethod(func)

Authorization and security database methods

class iceprod.server.dbmethods.auth.auth(parent)

The authorization / security DB methods.

Takes a handle to a subclass of iceprod.server.modules.db.DBAPI as an argument.

auth_get_site_auth(*args, **kwargs)
auth_authorize_site(*args, **kwargs)
auth_authorize_task(*args, **kwargs)
auth_new_passkey(*args, **kwargs)

Make a new passkey. The default expiration is 1 hour.

auth_get_passkey(*args, **kwargs)
add_site_to_master(*args, **kwargs)

Add a remote site to the master and return a new passkey

auth_user_create(*args, **kwargs)
auth_user(*args, **kwargs)

Authenticate a username and password

RPC database methods

class iceprod.server.dbmethods.rpc.rpc(parent)

The RPC DB methods.

rpc_echo(value, callback=None)

Echo a single value. A simple test to check whether RPC is working.

rpc_new_task(*args, **kwargs)

Get new task(s) from the queue specified by the gridspec, based on the hostname, network interfaces, and resources. Save the hostname and network in the nodes table.

Returns: a list of job configs (dicts)
Return type: list
rpc_set_processing(task_id)

Set a task to the processing status

Parameters: task_id (str) – task_id
rpc_finish_task(*args, **kwargs)

Do task completion operations.

Parameters:
  • task_id (str) – task_id
  • stats (dict) – statistics from task
rpc_task_error(*args, **kwargs)

Mark task as ERROR and possibly adjust resources.

Parameters:
  • task_id (str) – task id
  • error_info (dict) – error information
rpc_upload_logfile(*args, **kwargs)

Upload a logfile from a task.

rpc_stillrunning(*args, **kwargs)

Check that the task is still in a running state.

Running states are “queued” or “processing”. Queued is allowed because of possible race conditions around changing status to processing.

Parameters: task_id – task id
Returns: True or False
Return type: bool
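
The rule stated above reduces to a membership test on the task status. The sketch below assumes the status is available as a plain string; the helper name `is_still_running` is hypothetical and no database lookup is shown.

```python
# "queued" is accepted as well as "processing" because a task may report in
# before its status change to "processing" has been recorded.
RUNNING_STATES = {"queued", "processing"}


def is_still_running(status):
    """Return True if the given task status counts as running."""
    return status in RUNNING_STATES


results = {status: is_still_running(status)
           for status in ("queued", "processing", "suspended")}
```
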
rpc_update_pilot(*args, **kwargs)

Update the pilot table.

Parameters:
  • pilot_id (str) – Id of the pilot to update.
  • tasks (str) – csv list of tasks
  • resources_available (dict) – {resource:value}
  • resources_claimed (dict) – {resource:value}
rpc_submit_dataset(*args, **kwargs)

Submit a dataset.

Parameters:
  • config (dict) – A config object
  • difplus (str) – A serialized difplus
  • description (str) – The dataset description
  • gridspec (str) – The grid to run on
  • njobs (int) – Number of jobs to submit
  • stat_keys (list) – Statistics to keep
  • debug (bool) – Debug flag (default False)
rpc_update_dataset_config(*args, **kwargs)

Update a dataset config

Parameters:
  • dataset_id (str) – dataset id
  • config (str or dict) – config
rpc_get_groups(*args, **kwargs)

Get all the groups.

Returns: {group_id: group}
Return type: dict
rpc_set_groups(*args, **kwargs)

Set all the groups.

Parameters:
  • user (str) – user_id for authorization
  • groups (dict) – groups to update
rpc_get_user_roles(*args, **kwargs)

Get the roles a username belongs to.

Parameters: username (str) – user name
Returns: {role_id: role}
Return type: dict
rpc_set_user_roles(*args, **kwargs)

Set the roles of a username.

Parameters:
  • user (str) – user id for authorization
  • username (str) – user name to modify roles on
  • roles (iterable) – roles to set
rpc_queue_master(*args, **kwargs)

Handle global queueing request from a site.

For a task to queue on a site, it must be matched in the dataset gridspec list (or the list should be empty to match all), and the necessary resources should be available on the site.

Parameters:
  • resources (dict) – (optional) the available resources on the site
  • filters (dict) – (optional) group filters on the site
  • queueing_factor_priority (float) – (optional) queueing factor for priority
  • queueing_factor_dataset (float) – (optional) queueing factor for dataset id
  • queueing_factor_tasks (float) – (optional) queueing factor for number of tasks
  • num (int) – (optional) number of tasks to queue
Returns:

table entries to be merged

Return type:

dict
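
The matching rule described for rpc_queue_master (a gridspec list that is either empty or contains the site, plus sufficient resources) can be sketched as a predicate. All names below are hypothetical, and representing resources as plain `{resource: value}` dicts compared with `>=` is an assumption; the queueing factors are not modelled.

```python
def can_queue_on_site(site_gridspec, dataset_gridspecs, required, available):
    """Return True if a task may be queued on the site.

    site_gridspec     -- gridspec of the requesting site
    dataset_gridspecs -- gridspec list from the dataset (empty matches all)
    required          -- {resource: amount} needed by the task
    available         -- {resource: amount} offered by the site
    """
    gridspec_ok = not dataset_gridspecs or site_gridspec in dataset_gridspecs
    resources_ok = all(available.get(name, 0) >= amount
                       for name, amount in required.items())
    return gridspec_ok and resources_ok


matches_any = can_queue_on_site("grid1", [], {"cpu": 1}, {"cpu": 4})
wrong_grid = can_queue_on_site("grid1", ["grid2"], {"cpu": 1}, {"cpu": 4})
no_gpu = can_queue_on_site("grid1", ["grid1"], {"gpu": 1}, {"cpu": 4})
```
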

rpc_master_update(*args, **kwargs)
rpc_master_get_tables(*args, **kwargs)

Get a dump of selected tables from the master.

Parameters: tablenames (iterable) – An iterable of table names.
Returns: Dictionary of tables
Return type: dict
rpc_stop_module(module_name)
rpc_start_module(module_name)
rpc_update_config(config_text)
rpc_reset_task(*args, **kwargs)
rpc_resume_task(*args, **kwargs)
rpc_suspend_task(tasks)
rpc_reset_jobs(*args, **kwargs)
rpc_hard_reset_jobs(*args, **kwargs)
rpc_suspend_jobs(*args, **kwargs)
rpc_reset_dataset(*args, **kwargs)
rpc_hard_reset_dataset(*args, **kwargs)
rpc_suspend_dataset(*args, **kwargs)
rpc_truncate_dataset(*args, **kwargs)
rpc_public_get_graphs(*args, **kwargs)

Get the graph data for a length of time.

Parameters: start (int) – Number of minutes in the past from which to start collecting data
Returns: [{name, value, timestamp}]
Return type: list
rpc_public_get_number_of_tasks_in_each_state(*args, **kwargs)
rpc_public_get_datasets_by_status(*args, **kwargs)
rpc_public_get_config(*args, **kwargs)
rpc_public_get_all_config(*args, **kwargs)
rpc_public_get_task_stats(*args, **kwargs)
rpc_public_get_task_ids(*args, **kwargs)
rpc_public_get_dataset_description(*args, **kwargs)
rpc_public_get_dataset_steering(*args, **kwargs)
rpc_public_get_task_walltime(*args, **kwargs)
rpc_public_get_tasks_by_name(*args, **kwargs)
rpc_public_get_tasks_by_requirements(*args, **kwargs)
rpc_public_get_dataset_completion(*args, **kwargs)
rpc_public_get_all_dataset_completion(*args, **kwargs)
rpc_public_get_site_id(*args, **kwargs)
rpc_public_get_cpu_gpu_usage(*args, **kwargs)

Website database methods

class iceprod.server.dbmethods.web.web(parent)

The website DB methods.

web_get_tasks_by_status(*args, **kwargs)

Get the number of tasks in each state on this site and plugin.

Parameters:
  • gridspec (str) – grid and plugin id
  • dataset_id (str) – dataset id
Returns:

{status:num}

Return type:

dict

web_get_datasets(*args, **kwargs)

Get the number of datasets in each state on this site and plugin.

Filters are specified as key=['match1','match2']

Parameters:
  • gridspec (str) – grid and plugin id
  • groups (iterable) – Fields to group by
  • **filters (dict) – (optional) filters for the query
Returns:

[{dataset}]

Return type:

list
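
The `key=['match1','match2']` filter form above reads as "the field `key` must equal one of the listed values". A minimal in-memory sketch of that interpretation follows; the helper name and the sample rows are hypothetical, and how the real query applies the filters is not documented here.

```python
def matches_filters(row, **filters):
    """Return True if every filtered field of row holds one of its allowed values."""
    return all(row.get(key) in allowed for key, allowed in filters.items())


# Hypothetical dataset rows, for illustration only.
rows = [
    {"dataset_id": "d1", "status": "processing"},
    {"dataset_id": "d2", "status": "complete"},
    {"dataset_id": "d3", "status": "errors"},
]

selected = [row["dataset_id"] for row in rows
            if matches_filters(row, status=["processing", "complete"])]
```
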

web_get_datasets_details(*args, **kwargs)

Get the number of datasets in each state on this site and plugin.

Parameters:
  • dataset_id (str) – dataset id
  • status (str) – dataset status
  • gridspec (str) – grid and plugin id
Returns:

{status:num}

Return type:

dict

web_get_tasks_details(*args, **kwargs)

Get the number of tasks in each state on this site and plugin.

Parameters:
  • task_id (str) – task id
  • status (str) – task status
  • gridspec (str) – grid and plugin id
  • dataset_id (str) – dataset id
Returns:

{status:num}

Return type:

dict

web_get_logs(*args, **kwargs)

Get the logs for a task.

Parameters:
  • task_id (str) – task id
  • lines (int) – tail this number of lines (default: all lines)
Returns:

{log_name:text}

Return type:

dict
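
The `lines` parameter tails each log. A sketch of that behaviour on an already loaded `{log_name: text}` dict is shown below; the helper names are hypothetical and retrieval of the logs from the database is not shown.

```python
def tail_lines(text, lines=None):
    """Return the last `lines` lines of text, or all of it when lines is None."""
    if lines is None:
        return text
    if lines <= 0:
        return ""
    return "\n".join(text.splitlines()[-lines:])


def tail_logs(logs, lines=None):
    """Apply tail_lines to every entry of a {log_name: text} dict."""
    return {name: tail_lines(text, lines) for name, text in logs.items()}


logs = {"stdout": "a\nb\nc\nd", "stderr": "oops"}
last_two = tail_logs(logs, lines=2)
```
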

web_get_gridspec(*args, **kwargs)

Get the possible gridspecs that we know about.

Returns: {gridspecs}
Return type: dict
web_get_sites(**kwargs)

Get sites matching kwargs

web_get_dataset_by_name(*args, **kwargs)

Get a dataset by its name.

Parameters: name (str) – dataset name
Returns: dataset id
Return type: str
web_get_task_completion_stats(*args, **kwargs)

Get the task completion stats for a dataset.

Columns:
task_name, task_type, num_queued, num_running, num_completions, avg_runtime, max_runtime, min_runtime, error_count, efficiency
Parameters: dataset_id (str) – dataset id
Returns: {task_name: {column: num} }
Return type: dict
web_get_job_counts_by_status(*args, **kwargs)

Get count of jobs by status.

Parameters:
  • status (str) – status to restrict by
  • dataset_id (str) – dataset id
Returns:

{status: count}

Return type:

dict

web_get_jobs_by_status(*args, **kwargs)

Get basic job info.

Parameters:
  • status (str) – status to restrict by
  • dataset_id (str) – dataset id
Returns:

[job_info]

Return type:

dict

web_get_jobs_details(*args, **kwargs)

Get job details for a job_id.

Parameters: job_id (str) – job_id
Returns: {job_id:details}
Return type: dict

Node database methods

class iceprod.server.dbmethods.node.node(parent)

The node DB methods.

Takes a handle to a subclass of iceprod.server.modules.db.DBAPI as an argument.

node_update(*args, **kwargs)

Update node data.

Parameters:
  • hostname (str) – hostname of node
  • domain (str) – domain of node
  • **kwargs – gridspec and other statistics
node_collate_resources(*args, **kwargs)

Collate node resources into site resources.

Parameters:
  • site_id (str) – The site to assign resources to
  • node_include_age (int) – The maximum age, in days, of a node for it to be included
node_get_site_resources(*args, **kwargs)

Get all resources for a site.

Parameters:
  • site_id (str) – The site to examine
  • empty_only (bool) – Get only the empty resources, defaults to True
Returns:

resources

Return type:

dict

Cron database methods

class iceprod.server.dbmethods.cron.cron(parent)

The scheduled (cron) DB methods.

Takes a handle to a subclass of iceprod.server.modules.db.DBAPI as an argument.

cron_dataset_completion(*args, **kwargs)

Check for newly completed datasets and mark them as such

cron_job_completion(*args, **kwargs)

Check for job status changes.

If this is the master, mark jobs complete, suspended, or failed as necessary. Completed jobs also delete the job temp space.

If this is not the master, and if all tasks in a job are not in an active state, then delete the job and tasks.

cron_clean_completed_jobs(*args, **kwargs)

Check for old files in the dagtemp from completed jobs

cron_remove_old_passkeys()
cron_generate_web_graphs(*args, **kwargs)
cron_pilot_monitoring(*args, **kwargs)
cron_dataset_update(*args, **kwargs)

Update the dataset table on clients

cron_suspend_overusage_tasks(*args, **kwargs)

Suspend very high resource usage tasks

cron_check_active_pilots_tasks(*args, **kwargs)

Reset processing tasks that are not listed as running by an active pilot.

cron_dataset_status_monitoring(*args, **kwargs)

Monitor all datasets for job/task status summary.

cron_task_stat_monitoring(*args, **kwargs)

Monitor task statistics in ES.

cron_task_monitoring(*args, **kwargs)

Monitor task status in ES.

Misc database methods

class iceprod.server.dbmethods.misc.misc(parent)

The misc DB methods.

Takes a handle to a subclass of iceprod.server.modules.db.DBAPI as an argument.

misc_site_to_site_upload(src, dest)
misc_get_tables_for_task(*args, **kwargs)

Get all tables necessary to run task(s).

Parameters: task_ids (iterable) – An iterable of task_ids
Returns: table entries
Return type: dict
misc_update_tables(*args, **kwargs)

Update the DB tables with the incoming information.

Parameters: tables (dict) – {table_name:{keys:[],values:[[]]}}
Returns: success or failure
Return type: bool
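
To make the `{table_name:{keys:[],values:[[]]}}` layout concrete, the sketch below applies such a structure to an in-memory SQLite database. It is an illustration only: the function name `apply_tables`, the `task` table, and the use of `INSERT OR REPLACE` are assumptions, and the real method's merge semantics are not described in this reference.

```python
import sqlite3


def apply_tables(conn, tables):
    """Write every row of a {table_name: {keys: [...], values: [[...]]}} dict."""
    for table_name, data in tables.items():
        columns = ", ".join(data["keys"])
        placeholders = ", ".join("?" for _ in data["keys"])
        # Table and column names are interpolated, so they must come from a
        # trusted source; row values always go through bindings.
        sql = "INSERT OR REPLACE INTO {} ({}) VALUES ({})".format(
            table_name, columns, placeholders)
        conn.executemany(sql, data["values"])
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (task_id TEXT PRIMARY KEY, status TEXT)")

tables = {
    "task": {
        "keys": ["task_id", "status"],
        "values": [["t1", "queued"], ["t2", "processing"]],
    },
}
ok = apply_tables(conn, tables)
rows = conn.execute("SELECT task_id, status FROM task ORDER BY task_id").fetchall()
```

Each inner list in `values` is one row, with entries in the same order as `keys`.
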
misc_update_master_db(*args, **kwargs)

Update the DB with incoming information (query provided).

Parameters:
  • table (str) – The table affected
  • index (str) – That table’s index id
  • timestamp (str) – An ISO 8601 UTC timestamp
  • sql (str) – An sql statement
  • bindings (tuple) – Bindings for the sql statement