GNU/Linux

Debian 6.0.2

(Squeeze)

sacct(1)


SACCT


NAME
SYNOPSIS
DESCRIPTION
OPTIONS
INTERPRETING THE −DUMP OPTION OUTPUT
EXAMPLES
COPYING
FILES
SEE ALSO

NAME

sacct − displays accounting data for all jobs and job steps in the SLURM job accounting log or SLURM database

SYNOPSIS

sacct [OPTIONS...]

DESCRIPTION

Accounting information for jobs invoked with SLURM is either logged in the job accounting log file or saved to the SLURM database.

The sacct command displays job accounting data stored in the job accounting log file or SLURM database in a variety of forms for your analysis. The sacct command displays information on jobs, job steps, status, and exitcodes by default. You can tailor the output with the use of the −−fields= option to specify the fields to be shown.

For the root user, the sacct command displays job accounting data for all users, although there are options to filter the output to report only the jobs from a specified user or group.

For the non−root user, the sacct command limits the display of job accounting data to jobs that were launched with their own user identifier (UID) by default. Data for other users can be displayed with the −−allusers, −−user, or −−uid options.

Note:

Much of the data reported by sacct has been generated by the wait3() and getrusage() system calls. Some systems gather and report incomplete information for these calls; sacct reports values of 0 for this missing data. See your system’s getrusage(2) man page for information about which data are actually available on your system.

If −−dump is specified, the field selection options (−−brief, −−format, ...) have no effect.

When −−dump is specified, elapsed time fields are presented as two fields: integral seconds and integral microseconds.

If −−dump is not specified, elapsed time fields are presented as [[days-]hours:]minutes:seconds.hundredths.
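As a rough illustration of the non−dump layout described above, the following Python sketch converts such a string to a number of seconds. The name parse_elapsed is hypothetical and not part of SLURM, and the sketch assumes the fractional part, when present, follows the seconds after a period.

```python
import re

# Pattern for [[days-]hours:]minutes:seconds[.hundredths], as described above.
_ELAPSED_RE = re.compile(
    r"^(?:(?:(?P<days>\d+)-)?(?P<hours>\d+):)?"
    r"(?P<minutes>\d+):(?P<seconds>\d+(?:\.\d+)?)$"
)


def parse_elapsed(text):
    """Convert an elapsed-time string to a float number of seconds."""
    match = _ELAPSED_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized elapsed time: {text!r}")
    days = int(match.group("days") or 0)      # optional leading "days-"
    hours = int(match.group("hours") or 0)    # optional "hours:"
    minutes = int(match.group("minutes"))
    seconds = float(match.group("seconds"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

For instance, parse_elapsed("00:01:30") and parse_elapsed("01:30") both give 90.0.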

The default input file is the file named in the jobacct_logfile parameter in slurm.conf.

OPTIONS

−a , −−allusers

Displays the current user’s jobs. Displays all users’ jobs when run by root.

−A account_list , −−accounts=account_list

Displays jobs belonging to the accounts named in the comma−separated account_list argument.

−b , −−brief

Displays a brief listing, which includes the following data:

jobid

status

exitcode

This option has no effect when the −−dump option is also specified.

−C cluster_list, −−cluster=cluster_list

Displays the statistics only for the jobs started on the clusters specified by the cluster_list operand, which is a comma−separated list of clusters. Space characters are not allowed in the cluster_list. Use −1 for all clusters. The default is the cluster on which the sacct command is executed.

−c , −−completion

Use job completion instead of job accounting.

−d , −−dump

Dumps the raw data records.

The section titled "INTERPRETING THE −−dump OPTION OUTPUT" describes the data output when this option is used.

−−duplicates

If SLURM job ids are reset, but the job accounting log file isn’t reset at the same time (with −e, for example), some job numbers will probably appear more than once in the accounting log file to refer to different jobs; such jobs can be distinguished by the "submit" time stamp in the data records.

When data for specific jobs are requested with the −−jobs option, we assume that the user wants to see only the most recent job with that number. This behavior can be overridden by specifying −−duplicates, in which case all records that match the selection criteria will be returned.

−e , −−helpformat

Print a list of fields that can be specified with the −−format option.

Fields available:

AllocCPUS Account AssocID AveCPU
AvePages AveRSS AveVMSize BlockID
Cluster CPUTime CPUTimeRAW Elapsed
Eligible End ExitCode GID
Group JobID JobName Layout
MaxPages MaxPagesNode MaxPagesTask MaxRSS
MaxRSSNode MaxRSSTask MaxVMSize MaxVMSizeNode
MaxVMSizeTask MinCPU MinCPUNode MinCPUTask
NCPUS NNodes NodeList NTasks
Priority Partition QOS QOSRAW
ReqCPUS Reserved ResvCPU ResvCPURAW
Start State Submit Suspended
SystemCPU Timelimit TotalCPU UID
User UserCPU WCKey WCKeyID

The section titled "Job Accounting Fields" describes these fields.

−E end_time, −−endtime=end_time

Select jobs eligible before the specified time. If states are given with the −s option, return jobs in this state before this time.

Valid time formats are:

HH:MM[:SS] [AM|PM]
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
MM/DD[/YY]-HH:MM[:SS]
YYYY-MM-DD[THH:MM[:SS]]

−f file, −−file=file

Causes the sacct command to read job accounting data from the named file instead of the current SLURM job accounting log file. Only applicable when running the filetxt plugin.

−g gid_list, −−gid=gid_list −−group=group_list

Displays the statistics only for the jobs started with the GID or the GROUP specified by the gid_list or the group_list operand, which is a comma−separated list. Space characters are not allowed. Default is no restrictions.

−h , −−help

Displays a general help message.

−j job(.step) , −−jobs=job(.step)

Displays information about the specified job(.step) or list of job(.step)s.

The job(.step) parameter is a comma−separated list of jobs. Space characters are not permitted in this list.

The default is to display information on all jobs.
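The list syntax described above can be sketched in Python as follows. The function parse_job_list is a hypothetical helper, not part of SLURM, and it assumes that job and step identifiers are plain integers.

```python
def parse_job_list(job_list):
    """Split a comma-separated job(.step) list into (job, step) tuples.

    step is None when only a job number is given.
    """
    if " " in job_list:
        raise ValueError("space characters are not permitted in the list")
    selections = []
    for item in job_list.split(","):
        # "4.0" -> job "4", step "0"; "3" -> job "3", step ""
        job, _, step = item.partition(".")
        selections.append((int(job), int(step) if step else None))
    return selections
```

For example, parse_job_list("3,4.0,5") returns [(3, None), (4, 0), (5, None)].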

−l, −−long

Equivalent to specifying:

´−−fields=jobid,jobname,partition,maxvsize,maxvsizenode,maxvsizetask,avevsize,maxrss,maxrssnode,maxrsstask,averss,maxpages,maxpagesnode,maxpagestask,avepages,mincpu,mincpunode,mincputask,avecpu,ntasks,alloccpus,elapsed,state,exitcode´

−L, −−allclusters

Display jobs run on all clusters. By default, only jobs run on the cluster from which sacct is called are displayed.

−n, −−noheader

No heading will be added to the output. The default action is to display a header.

This option has no effect when used with the −−dump option.

−N, −−nodelist

Display jobs that ran on any of the specified nodes. One or more nodes can be given using a ranged string.

−o , −−format

Comma−separated list of fields (use "−−helpformat" for a list of available fields).

NOTE: When using the format option to list fields, you can append %NUMBER to a field to specify how many characters should be printed.

For example, format=name%30 will print 30 characters of the field name, right justified. A %−30 will print 30 characters left justified.
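The justification rule can be imitated in a few lines of Python. This is only a sketch of the behavior described above, not sacct’s own code: format_field is a hypothetical name, and truncating values longer than the requested width is an assumption.

```python
def format_field(value, width):
    """Render value in abs(width) characters.

    A positive width right-justifies; a negative width left-justifies.
    Truncating longer values is an assumption of this sketch.
    """
    size = abs(width)
    text = str(value)[:size]
    return text.ljust(size) if width < 0 else text.rjust(size)
```

With this helper, format_field("job1", 8) pads on the left and format_field("job1", -8) pads on the right.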

−O , −−formatted_dump

Dumps accounting records in an easy−to−read format.

This option is provided for debugging.

−p , −−parsable

Output will be ’|’ delimited, with a ’|’ at the end.

−P , −−parsable2

Output will be ’|’ delimited, without a ’|’ at the end.
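The difference between the two parsable forms matters mostly to scripts that split the output. A minimal Python sketch, assuming one record per line (split_parsable is a hypothetical helper, not part of SLURM):

```python
def split_parsable(line, trailing_delimiter):
    """Split one '|' delimited output line into its fields.

    trailing_delimiter=True corresponds to --parsable (a '|' at the end);
    False corresponds to --parsable2.
    """
    line = line.rstrip("\n")
    if trailing_delimiter:
        if not line.endswith("|"):
            raise ValueError("expected a trailing '|'")
        line = line[:-1]  # drop the final delimiter before splitting
    return line.split("|")
```

Both split_parsable("3|COMPLETED|0|", True) and split_parsable("3|COMPLETED|0", False) yield the same three fields; the sample lines are made up for illustration.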

−r , −−partition

Comma separated list of partitions to select jobs and job steps from. The default is all partitions.

−s state_list , −−state=state_list

Selects jobs based on their current state or the state they were in during the time period given, which can be designated with the following state designators:

r

running

s

suspended

ca

cancelled

cd

completed

pd

pending

f

failed

to

timed out

nf

node_fail

The state_list operand is a comma−separated list of these state designators. Space characters are not allowed in the state_list.

NOTE: When states are specified and no start time is given, the default start time is ’now’.
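A script that builds a state_list may want to check it first. The sketch below maps the designators listed above to their names; expand_state_list is a hypothetical helper, and treating the designators as lowercase only is an assumption.

```python
# Designators and names exactly as listed above.
STATE_DESIGNATORS = {
    "r": "running",
    "s": "suspended",
    "ca": "cancelled",
    "cd": "completed",
    "pd": "pending",
    "f": "failed",
    "to": "timed out",
    "nf": "node_fail",
}


def expand_state_list(state_list):
    """Return the state names for a comma-separated list of designators."""
    if " " in state_list:
        raise ValueError("space characters are not allowed in the state_list")
    names = []
    for designator in state_list.split(","):
        if designator not in STATE_DESIGNATORS:
            raise ValueError(f"unknown state designator: {designator!r}")
        names.append(STATE_DESIGNATORS[designator])
    return names
```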

−S , −−starttime

Select jobs eligible after the specified time. The default is midnight of the current day. If states are given with the −s option, return jobs in this state at this time; in that case ’now’ is used as the default time.

Valid time formats are:

HH:MM[:SS] [AM|PM]
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
MM/DD[/YY]-HH:MM[:SS]
YYYY-MM-DD[THH:MM[:SS]]
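When a start or end time is generated by a script, the last of the formats listed above is the least ambiguous to produce. A minimal Python sketch (to_sacct_time is a hypothetical helper name):

```python
from datetime import datetime


def to_sacct_time(moment):
    """Format a datetime as YYYY-MM-DDTHH:MM:SS."""
    return moment.strftime("%Y-%m-%dT%H:%M:%S")
```

For example, to_sacct_time(datetime(2011, 7, 4, 9, 30, 0)) gives "2011-07-04T09:30:00".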

−T , −−truncate

Truncate time. If a job started before −−starttime, its start time is truncated to −−starttime. Likewise, if a job ended after −−endtime, its end time is truncated to −−endtime.
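The truncation amounts to clamping a job’s interval to the reporting window. The following Python sketch shows the idea under the assumption that all times are expressed as seconds since the Epoch; truncate_interval is a hypothetical name and this is not sacct’s implementation.

```python
def truncate_interval(job_start, job_end, window_start, window_end):
    """Clamp a job's start and end times to the reporting window."""
    return max(job_start, window_start), min(job_end, window_end)
```

A job running from 100 to 500 reported over the window 200 to 400 would thus be shown as running from 200 to 400.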

−u uid_list, −−uid=uid_list −−user=user_list

Use this comma separated list of uids or user names to select jobs to display. By default, the running user’s uid is used.

−−usage

Displays a help message.

−v , −−verbose

Primarily for debugging. Reports the state of certain variables during processing.

−V , −−version

Print version.

−W wckey_list, −−wckeys=wckey_list

Displays the statistics only for the jobs started on the wckeys specified by the wckey_list operand, which is a comma−separated list of wckey names. Space characters are not allowed in the wckey_list. Default is all wckeys.

−x associd_list, −−associations=assoc_list

Displays the statistics only for the jobs running under the association ids specified by the assoc_list operand, which is a comma−separated list of association ids. Space characters are not allowed in the assoc_list. Default is all associations.

−X , −−allocations

Only show cumulative statistics for each job, not the intermediate steps.

Job Accounting Fields
The following describes each job accounting field:

alloccpus

Count of allocated processors.

account

Account the job ran under.

associd

Reference to the association of user, account and cluster.

avecpu

Average CPU time of a process.

avepages

Average pages of a process.

averss

Average resident set size of a process.

avevsize

Average Virtual Memory size of a process.

blockid

Block ID, applicable to BlueGene computers only.

cluster

Cluster name.

cputime

Formatted number of cpu seconds a process was allocated.

cputimeraw

The CPU time the process was allocated, in seconds; unlike cputime, the value is not formatted.

elapsed

The job’s elapsed time.

The format of this field’s output is as follows:

[DD−[hh:]]mm:ss

as defined by the following:

DD

days

hh

hours

mm

minutes

ss

seconds

eligible

When the job became eligible to run.

end

Termination time of the job. Format output is as follows:

MM/DD−hh:mm:ss

as defined by the following:

MM

month

DD

day

hh

hours

mm

minutes

ss

seconds
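Because the time stamp carries no year, a script reading it usually has to supply one itself. The Python sketch below only splits the stamp into its components; parse_end_stamp is a hypothetical helper, and it assumes the separator between date and time is an ASCII hyphen in actual output.

```python
import re

# MM/DD-hh:mm:ss, two digits per component.
_STAMP_RE = re.compile(r"^(\d{2})/(\d{2})-(\d{2}):(\d{2}):(\d{2})$")


def parse_end_stamp(text):
    """Return (month, day, hours, minutes, seconds) as integers."""
    match = _STAMP_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {text!r}")
    return tuple(int(part) for part in match.groups())
```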

exitcode

The first non−zero error code returned by any job step.

gid

The group identifier of the user who ran the job.

group

The group name of the user who ran the job.

jobid

The number of the job or job step. It is in the form: job.jobstep.

jobname

The name of the job or job step. Because the slurm_accounting.log file is space delimited, any space in a jobname is replaced with an underscore before the record is written to the accounting file. A jobname that contained a space is therefore displayed by sacct with an underscore in place of the space.

layout

What the layout of a step was when it was running. This can be used to give you an idea of which node ran which rank in your job.

maxpages

Maximum page faults of a process.

maxpagesnode

The node where the maxpages occurred.

maxpagestask

The task on maxpagesnode where the maxpages occurred.

maxrss

Maximum resident set size of a process.

maxrssnode

The node where the maxrss occurred.

maxrsstask

The task on maxrssnode where the maxrss occurred.

maxvmsize

Maximum Virtual Memory size of any process.

maxvmsizenode

The node where the maxvmsize occurred.

maxvmsizetask

The task on maxvmsizenode where the maxvmsize occurred.

mincpu

Minimum CPU time of any process.

mincpunode

The node where the mincpu occurred.

mincputask

The task on mincpunode where the mincpu occurred.

ncpus

Total number of CPUs allocated to the job.

nodelist

List of nodes in job/step.

nnodes

Number of nodes in a job or step.

ntasks

Total number of tasks in a job or step.

priority

Slurm priority.

partition

Identifies the partition on which the job ran.

qos

Name of Quality of Service.

qosraw

Id of Quality of Service.

reqcpus

Required CPUs.

reserved

How much wall clock time was used as reserved time for this job. This is derived from how long a job was waiting from eligible time to when it actually started.

resvcpu

Formatted time for how long (cpu secs) a job was reserved for.

resvcpuraw

Reserved CPU time in seconds, not formatted.

start

Initiation time of the job in the same format as end.

state

Displays the job status, or state.

Output can be RUNNING, SUSPENDED, COMPLETED, CANCELLED, FAILED, TIMEOUT, or NODE_FAIL.

submit

The time and date stamp (in Coordinated Universal Time, UTC) at which the job was submitted. The format of the output is identical to that of the end field.

suspended

How long the job was suspended for.

SystemCPU

The amount of system CPU time used by the job or job step. The format of the output is identical to that of the elapsed field.

NOTE: SystemCPU provides a measure of the task’s parent process and does not include CPU time of child processes.

timelimit

What the timelimit was/is for the job.

TotalCPU

The sum of the SystemCPU and UserCPU time used by the job or job step. The total CPU time of the job may exceed the job’s elapsed time for jobs that include multiple job steps. The format of the output is identical to that of the elapsed field.

NOTE: TotalCPU provides a measure of the task’s parent process and does not include CPU time of child processes.

uid

The user identifier of the user who ran the job.

user

The user name of the user who ran the job.

UserCPU

The amount of user CPU time used by the job or job step. The format of the output is identical to that of the elapsed field.

NOTE: UserCPU provides a measure of the task’s parent process and does not include CPU time of child processes.

wckey

Workload Characterization Key. Arbitrary string for grouping orthogonal accounts together.

wckeyid

Reference to the wckey.

INTERPRETING THE −DUMP OPTION OUTPUT

The sacct command’s −−dump option displays data in a horizontal list of fields depending on the record type. There are three record types: JOB_START, JOB_STEP, and JOB_TERMINATED. A subsection describes the output for each record type.

When the data output is a job accounting field, as described in the section titled "Job Accounting Fields", only the name of the job accounting field is listed. Otherwise, additional information is provided.

Note:

The output for the JOB_STEP and JOB_TERMINATED record types presents a pair of fields for each of the following data: Total CPU time, Total User CPU time, and Total System CPU time. The first field of each pair is the time in seconds expressed as an integer. The second field of each pair is the fractional number of seconds multiplied by one million. Thus, a pair of fields output as "1 024315" means that the time is 1.024315 seconds. The least significant digits in the second field are truncated in formatted displays.
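The arithmetic behind such a pair can be written out in Python. This is a small sketch using the standard decimal module to avoid floating point rounding; pair_to_seconds is a hypothetical helper name.

```python
from decimal import Decimal


def pair_to_seconds(whole, micro):
    """Combine an integer-seconds field and a microseconds field."""
    return Decimal(int(whole)) + Decimal(int(micro)) / Decimal(1_000_000)


print(pair_to_seconds("1", "024315"))  # prints 1.024315
```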

Output for the JOB_START Record Type
The following describes the horizontal fields output by the sacct −−dump option for the JOB_START record type.

Field #   Field
1         job
2         partition
3         submitted
4         The job’s start time; this value is the number of non−leap seconds since the Epoch (00:00:00 UTC, January 1, 1970)
5         uid.gid
6         (Reserved)
7         JOB_START (literal string)
8         Job Record Version (1)
9         The number of fields in the record (16)
10        uid
11        gid
12        The job name
13        Batch Flag (0=no batch)
14        Relative SLURM priority
15        ncpus
16        nodes
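Field 4 holds the start time as seconds since the Epoch. A short Python sketch for turning such a value into a readable UTC time follows; epoch_to_utc is a hypothetical helper name.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert seconds since the Epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


print(epoch_to_utc("86400").isoformat())  # prints 1970-01-02T00:00:00+00:00
```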

Output for the JOB_STEP Record Type
The following describes the horizontal fields output by the sacct −−dump option for the JOB_STEP record type.

Field #   Field
1         job
2         partition
3         submitted
4         The job’s start time; this value is the number of non−leap seconds since the Epoch (00:00:00 UTC, January 1, 1970)
5         uid.gid
6         (Reserved)
7         JOB_STEP (literal string)
8         Job Record Version (1)
9         The number of fields in the record (38)
10        jobid
11        end
12        Completion Status; the mnemonics, which may appear in uppercase or lowercase, are as follows:
            CA   Cancelled
            CD   Completed successfully
            F    Failed
            NF   Job terminated from node failure
            R    Running
            S    Suspended
            TO   Timed out
13        exitcode
14        ntasks
15        ncpus
16        elapsed time in seconds expressed as an integer
17        Integer portion of the Total CPU time in seconds for all processes
18        Fractional portion of the Total CPU time for all processes expressed in microseconds
19        Integer portion of the Total User CPU time in seconds for all processes
20        Fractional portion of the Total User CPU time for all processes expressed in microseconds
21        Integer portion of the Total System CPU time in seconds for all processes
22        Fractional portion of the Total System CPU time for all processes expressed in microseconds
23        rss
24        ixrss
25        idrss
26        isrss
27        minflt
28        majflt
29        nswap
30        inblocks
31        outblocks
32        msgsnd
33        msgrcv
34        nsignals
35        nvcsw
36        nivcsw
37        vsize

Output for the JOB_TERMINATED Record Type
The following describes the horizontal fields output by the sacct −−dump option for the JOB_TERMINATED (literal string) record type.

Field #   Field
1         job
2         partition
3         submitted
4         The job’s start time; this value is the number of non−leap seconds since the Epoch (00:00:00 UTC, January 1, 1970)
5         uid.gid
6         (Reserved)
7         JOB_TERMINATED (literal string)
8         Job Record Version (1)
9         The number of fields in the record (38). Although thirty−eight fields are displayed by the sacct command for the JOB_TERMINATED record, only fields 1 through 12 are recorded in the actual data file; the sacct command aggregates the remainder.
10        The total elapsed time in seconds for the job.
11        end
12        Completion Status; the mnemonics, which may appear in uppercase or lowercase, are as follows:
            CA   Cancelled
            CD   Completed successfully
            F    Failed
            NF   Job terminated from node failure
            R    Running
            TO   Timed out
13        exitcode
14        ntasks
15        ncpus
16        elapsed time in seconds expressed as an integer
17        Integer portion of the Total CPU time in seconds for all processes
18        Fractional portion of the Total CPU time for all processes expressed in microseconds
19        Integer portion of the Total User CPU time in seconds for all processes
20        Fractional portion of the Total User CPU time for all processes expressed in microseconds
21        Integer portion of the Total System CPU time in seconds for all processes
22        Fractional portion of the Total System CPU time for all processes expressed in microseconds
23        rss
24        ixrss
25        idrss
26        isrss
27        minflt
28        majflt
29        nswap
30        inblocks
31        outblocks
32        msgsnd
33        msgrcv
34        nsignals
35        nvcsw
36        nivcsw
37        vsize
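All three record types share the same first nine fields, so a script can read that common header before deciding how to interpret the rest. The Python sketch below assumes the dumped fields are separated by whitespace, which this page does not state explicitly; RecordHeader and parse_header are hypothetical names, and the sample record in the usage note is invented for illustration.

```python
from typing import NamedTuple


class RecordHeader(NamedTuple):
    """Fields 1 through 9, common to every record type."""
    job: str
    partition: str
    submitted: str
    start: str
    uid_gid: str
    reserved: str
    record_type: str
    version: str
    field_count: str


RECORD_TYPES = ("JOB_START", "JOB_STEP", "JOB_TERMINATED")


def parse_header(line):
    """Return (header, remaining_fields) for one dumped record."""
    fields = line.split()  # assumption: whitespace-separated fields
    if len(fields) < 9:
        raise ValueError("record has fewer than nine fields")
    header = RecordHeader(*fields[:9])
    if header.record_type not in RECORD_TYPES:
        raise ValueError(f"unknown record type: {header.record_type!r}")
    return header, fields[9:]
```

Given a made-up line such as "4 debug 1310000000 1310000005 500.500 - JOB_START 1 16 500 500 myjob 0 5 2 node1", the header’s record_type is JOB_START and seven type-specific fields remain.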

EXAMPLES

This example illustrates the default invocation of the sacct command:

# sacct
Jobid      Jobname    Partition  Account    AllocCPUS  State      ExitCode
−−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−−
2          script01   srun       acct1      1          RUNNING    0
3          script02   srun       acct1      1          RUNNING    0
4          endscript  srun       acct1      1          RUNNING    0
4.0                   srun       acct1      1          COMPLETED  0

This example shows the same job accounting information with the brief option.

# sacct −−brief
Jobid      Status     Exitcode
−−−−−−−−−− −−−−−−−−−− −−−−−−−−
2          RUNNING    0
3          RUNNING    0
4          RUNNING    0
4.0        COMPLETED  0

# sacct −−allocations
Jobid      Jobname    Partition  Account    AllocCPUS  State      Exitcode
−−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−−
3          sja_init   andy       acct1      1          COMPLETED  0
4          sjaload    andy       acct1      2          COMPLETED  0
5          sja_scr1   andy       acct1      1          COMPLETED  0
6          sja_scr2   andy       acct1      18         COMPLETED  2
7          sja_scr3   andy       acct1      18         COMPLETED  0
8          sja_scr5   andy       acct1      2          COMPLETED  0
9          sja_scr7   andy       acct1      90         COMPLETED  1
10         endscript  andy       acct1      186        COMPLETED  0

This example demonstrates the ability to customize the output of the sacct command. The fields are displayed in the order designated on the command line.

# sacct −−fields=jobid,elapsed,ncpus,ntasks,status
Jobid      Elapsed    Ncpus      Ntasks   Status
−−−−−−−−−− −−−−−−−−−− −−−−−−−−−− −−−−−−−− −−−−−−−−−−
3          00:01:30   2          1        COMPLETED
3.0        00:01:30   2          1        COMPLETED
4          00:00:00   2          2        COMPLETED
4.0        00:00:01   2          2        COMPLETED
5          00:01:23   2          1        COMPLETED
5.0        00:01:31   2          1        COMPLETED
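When sacct is driven from a script, it can help to assemble the argument list in one place. The Python sketch below only builds the list and does not run anything; build_sacct_argv is a hypothetical helper, and it uses only the −−format, −−jobs, and −−parsable2 options documented in the OPTIONS section.

```python
def build_sacct_argv(fields, jobs=None, parsable2=False):
    """Assemble an argument list for invoking sacct.

    fields: field names, shown in the order given.
    jobs: optional job(.step) selections, as strings.
    parsable2: request '|' delimited output without a trailing '|'.
    """
    argv = ["sacct", "--format=" + ",".join(fields)]
    if jobs:
        argv.append("--jobs=" + ",".join(jobs))
    if parsable2:
        argv.append("--parsable2")
    return argv
```

The resulting list can be handed to a process launcher such as Python’s subprocess.run on a system where sacct is installed.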

COPYING

Copyright (C) 2005−2007 Copyright Hewlett−Packard Development Company L.P.

Copyright (C) 2008−2009 Lawrence Livermore National Security. Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). CODE−OCEC−09−009. All rights reserved.

This file is part of SLURM, a resource management program. For details, see <https://computing.llnl.gov/linux/slurm/>.

SLURM is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

SLURM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

FILES

/etc/slurm.conf

Entries to this file enable job accounting and designate the job accounting log file that collects system job accounting.

/var/log/slurm_accounting.log

The default job accounting log file. By default, this file is set to read and write permission for root only.

SEE ALSO

sstat(1), ps(1), srun(1), squeue(1), getrusage(2), time(2)


