Scalability describes how well a system can use resources to perform work, and how the work capacity increases as resources are added.

A service is said to be scalable if adding resources to the system increases its performance in proportion to the resources added.
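One way to make "proportional" concrete is to compare the gain in throughput with the growth in resources. The sketch below is illustrative only: the function name `scaling_efficiency`, the choice of requests per second as the measure of performance, and the sample numbers are all assumptions, not part of the referenced article.

```python
def scaling_efficiency(base_throughput, base_nodes, new_throughput, new_nodes):
    """Return speedup divided by resource growth.

    A value near 1.0 means performance grew in proportion to the resources
    added; a value well below 1.0 means the extra resources were poorly used.
    """
    if min(base_throughput, base_nodes, new_throughput, new_nodes) <= 0:
        raise ValueError("all inputs must be positive")
    speedup = new_throughput / base_throughput    # how much more work gets done
    resource_factor = new_nodes / base_nodes      # how much hardware was added
    return speedup / resource_factor


# Hypothetical measurements: 2 nodes serve 1000 req/s, 4 nodes serve 1800 req/s.
efficiency = scaling_efficiency(1000, 2, 1800, 4)
print(f"scaling efficiency: {efficiency:.2f}")
```

With these made-up numbers, doubling the nodes yields 1.8 times the throughput, so the efficiency is about 0.9: close to proportional, but not perfectly so.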

In distributed systems, scalability also describes how redundancy and fault tolerance affect the overall performance of the system. In general, a service should remain highly available, and the resources added for redundancy should cost little or no performance.

Scalability is usually one of the goals of any good system design, and it generally cannot be achieved with a poor design. Achieving good scalability requires knowing the data access patterns, the expected growth rates, and the physical limitations of the hardware and network.

References

https://www.allthingsdistributed.com/2006/03/a_word_on_scalability.html