TL;DR: SUS is a reliable measure of a system’s perceived usability. It uses 10 statements, each rated from strongly disagree to strongly agree, to measure perceived usability and, in some cases, learnability. In Tullis and Stetson’s study, the SUS result stopped changing once the sample reached 12 participants.
A great deal of effort goes into uncovering the usability issues of a system. The System Usability Scale (SUS) is a quantitative measure of a system’s usability. Although it doesn’t pinpoint the specific attributes of the system that affect usability, it has been shown to be one of the most powerful ways to put a number on the system’s perceived usability.
Setting up the System Usability Scale (SUS)
The SUS is a questionnaire that can be administered either verbally in a moderated environment or in writing in an unmoderated environment. It was introduced in 1986 by John Brooke as a unidimensional, quick-and-dirty way of measuring usability, and it has been validated numerous times [3, 4].
The questions of the SUS
The SUS consists of 10 questions that alternate between positive and negative wording. The following are the original questions, with question 8 modified by Finstad for non-native English speakers:
- I think that I would like to use this system frequently.
- I found the system unnecessarily complex.
- I thought the system was easy to use.
- I think that I would need the support of a technical person to be able to use this system.
- I found the various functions in this system were well integrated.
- I thought there was too much inconsistency in this system.
- I would imagine that most people would learn to use this system very quickly.
- I found the system very cumbersome/awkward to use.
- I felt very confident using the system.
- I needed to learn a lot of things before I could get going with this system.
The responses of the SUS
Each question is answered on a scale from 1 (Strongly Disagree) to 5 (Strongly Agree). Participants shouldn’t dwell on any specific question, but should instead respond with their first thoughts. If the test is administered verbally and the participant doesn’t answer a specific question, it is scored as a 3.
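As a minimal sketch of the rule above, unanswered questions can be replaced with the midpoint of the scale before scoring. The function name `fill_missing` and the use of `None` to mark a skipped question are assumptions made for this illustration.

```python
from typing import List, Optional, Sequence

NEUTRAL = 3  # midpoint of the 1-5 scale, used for unanswered questions


def fill_missing(responses: Sequence[Optional[int]]) -> List[int]:
    """Return the responses with every unanswered item (None) scored as 3."""
    return [NEUTRAL if answer is None else answer for answer in responses]


# Example: the participant skipped the second question.
print(fill_missing([5, None, 4]))
```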
Calculating the score of SUS
- Each question’s score ranges between 0 and 4
- Positively worded questions are 1, 3, 5, 7 and 9
- Negatively worded questions are 2, 4, 6, 8 and 10
Scoring each question:
- For positively worded questions, take the number of the answer and subtract 1 (e.g. if the answer to question one was neutral (3), then the score of question one is 3 – 1 = 2)
- For negatively worded questions, take the number of the answer and subtract it from 5 (e.g. if the answer to question two was agree (4), then the score of question two is 5 – 4 = 1)
After that, the overall score of the SUS is calculated as follows:
- Add all 10 questions’ scores
- Multiply the number by 2.5
- You will get a number between 0 and 100, which is the SUS score
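The scoring steps above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function names `sus_item_scores` and `sus_score` are made up for this example, and the responses are assumed to be given in question order (question 1 first) as integers from 1 to 5.

```python
from typing import List, Sequence


def sus_item_scores(responses: Sequence[int]) -> List[int]:
    """Convert ten raw answers (1-5) into per-question scores (0-4)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    scores = []
    for number, answer in enumerate(responses, start=1):
        if not 1 <= answer <= 5:
            raise ValueError(f"answer to question {number} must be 1-5")
        if number % 2 == 1:
            # Positively worded questions (1, 3, 5, 7, 9): answer minus 1
            scores.append(answer - 1)
        else:
            # Negatively worded questions (2, 4, 6, 8, 10): 5 minus answer
            scores.append(5 - answer)
    return scores


def sus_score(responses: Sequence[int]) -> float:
    """Sum the ten question scores and multiply by 2.5 (range 0-100)."""
    return sum(sus_item_scores(responses)) * 2.5


# A participant who answers "neutral" (3) everywhere scores 50.
print(sus_score([3] * 10))
```

Keeping the per-question scores in a separate function makes it easier to inspect individual questions later.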
Extracting learnability measurements
Although James R. Lewis and Jeff Sauro found some evidence that SUS measures two dimensions of a system, Learnability and Usability, this is not considered to be the case in most studies. In a review of 9000 questionnaires, Lewis and Sauro instead found a factorisation into negatively vs positively worded questions rather than into Usability and Learnability. However, there is some connection between questions 4 and 10 that in some cases may be strong enough to form a separate factor.
If it is shown that, in the specific field of the study, questions 4 and 10 form a separate factor that measures learnability (most of the time this is not the case), the learnability score can be extracted from the SUS as follows:
- Add the scores of questions 4 and 10
- Multiply the result by 12.5
- You will get a number between 0 and 100, which corresponds to the learnability of the system
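A minimal sketch of the learnability calculation, under the same assumptions as the steps above: the responses are raw answers from 1 to 5 in question order, and the function name `sus_learnability` is hypothetical. Questions 4 and 10 are both negatively worded, so each is scored as 5 minus the answer before the two are added.

```python
from typing import Sequence


def sus_learnability(responses: Sequence[int]) -> float:
    """Learnability score (0-100) from questions 4 and 10 of the SUS."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    q4, q10 = responses[3], responses[9]  # zero-based positions of Q4 and Q10
    for answer in (q4, q10):
        if not 1 <= answer <= 5:
            raise ValueError("answers must be between 1 and 5")
    # Both questions are negatively worded: score = 5 - answer (0-4 each)
    combined = (5 - q4) + (5 - q10)
    # The combined score ranges from 0 to 8, so 12.5 scales it to 0-100
    return combined * 12.5


# Neutral answers to both questions give a learnability score of 50.
print(sus_learnability([3] * 10))
```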
Understanding the score of SUS
A score of 0 translates to “it is impossible to perform the required task”, whereas a score of 100 means that the required task is performed easily. However, most systems are evaluated by testing a flow with numerous tasks, so realistically it is very difficult for a score to be close to 0. For that reason, it is suggested to use the SUS score as a relative measurement for the system tested. A system with a higher SUS score than another is more usable, although the SUS cannot show how much more usable.
How many participants should be included in a SUS test
The power of SUS also shows in the small sample size it needs to produce stable results. Thomas S. Tullis and Jacqueline N. Stetson showed that with 12 or more participants the SUS result remains the same.
Can you change the word “system” in the SUS questions?
SUS is quite resilient to changes in the name of the “system”. Bangor et al. replaced the word “system” with “product” and didn’t see a significant change in the results. This suggests that other, more specific synonyms of “system” could probably be used safely.
Internationalised SUS vs non-native English speakers
The SUS has been tested with non-native English speakers, and its reliability stays the same. However, several researchers have translated SUS into local languages and validated the translations (e.g. the Greek version by Katsanos et al.). If the participants are native speakers of a language other than English, it is worth considering the local version of the SUS.
 Brooke, J. (1996). SUS: A ‘quick and dirty’ usability scale. In P.W. Jordan, B. Thomas, B.A. Weerdmeester, & I.L. McClelland (Eds.) Usability Evaluation in Industry (pp. 189-194). Taylor & Francis: London.