What is cardinality (database)?
In the context of databases, cardinality refers to the uniqueness of the data values contained in a column. High cardinality means that most of the values in the column are distinct. Low cardinality means that the column contains many repeated values.
Less commonly, cardinality also refers to the relationships between tables. The cardinality between tables can be one-to-one, one-to-many (or many-to-one), or many-to-many.
High cardinality columns are those in which all or nearly all of the values are distinct. For example, in a database table that stores bank account numbers, the 'Account Number' column should have very high cardinality: by definition, every value in that column should be unique.
Normal cardinality columns are those in which some values repeat but most are distinct. For example, if a table contains customer information, the 'Last Name' column would have normal cardinality. Not every last name will be unique (for example, there will likely be multiple occurrences of 'Smith'), but by and large the values are fairly non-repetitive.
Low cardinality columns are those with very few unique values. In a customer table, a column with low cardinality would be the 'Gender' column. This column will likely have only 'M' and 'F' as possible values, so each of the thousands or millions of records in the table takes one of those two values.
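The three levels described above can be made concrete by dividing the number of distinct values in a column by the number of rows. The following is a minimal sketch in Python; the function name cardinality_ratio and the sample rows are hypothetical and chosen only for illustration, and there is no standard threshold separating "high", "normal", and "low".

```python
def cardinality_ratio(values):
    """Return the number of distinct values divided by the number of rows.

    A result of 1.0 means every value is unique; a result close to 0
    means the column is dominated by repeated values.
    """
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical sample rows: (account_number, last_name, gender)
rows = [
    ("1001", "Smith", "M"),
    ("1002", "Jones", "F"),
    ("1003", "Smith", "F"),
    ("1004", "Garcia", "M"),
    ("1005", "Lee", "F"),
    ("1006", "Smith", "M"),
]

account_ratio = cardinality_ratio(r[0] for r in rows)    # 6 distinct values in 6 rows
last_name_ratio = cardinality_ratio(r[1] for r in rows)  # 4 distinct values in 6 rows
gender_ratio = cardinality_ratio(r[2] for r in rows)     # 2 distinct values in 6 rows

print(account_ratio, last_name_ratio, gender_ratio)
```

On a real table the gap would be far wider: with millions of rows, the ratio for a two-value column approaches zero, while the ratio for a unique account number stays at 1.0.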
Cardinality relationships between tables can take the form of one-to-one, one-to-many (the inverse of which is many-to-one), or many-to-many. These terms describe how many rows in one table can be related to a single row in the other. For example, the relationship between the 'Customers' table and the 'Bank Accounts' table is one-to-many, meaning a customer can have multiple accounts, but an account cannot belong to more than one customer. That is, provided that this bank has never heard of shared accounts!
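As a rough illustration of the customer and account example, the sketch below classifies a list of (customer, account) links by checking whether any customer appears with more than one account and whether any account appears with more than one customer. The function name relationship_cardinality and the sample data are hypothetical. Note that this only describes the data actually observed; in a real database the intended cardinality is normally declared in the schema through keys and constraints.

```python
from collections import defaultdict


def relationship_cardinality(pairs):
    """Classify observed (left_key, right_key) links between two tables."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)

    # Does any left-hand row link to several right-hand rows, and vice versa?
    left_has_many = any(len(s) > 1 for s in rights_per_left.values())
    right_has_many = any(len(s) > 1 for s in lefts_per_right.values())

    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"


# Hypothetical links: each account belongs to exactly one customer.
owned = [("alice", "A1"), ("alice", "A2"), ("bob", "A3")]
print(relationship_cardinality(owned))   # one-to-many

# Adding a shared account (A1 now also belongs to bob) changes the result.
shared = owned + [("bob", "A1")]
print(relationship_cardinality(shared))  # many-to-many
```

The second call shows the shared-account case mentioned above: once a single account can belong to more than one customer, the relationship is no longer one-to-many.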