LEN() function, trailing spaces and uniqueness
Recently, I was faced with the fact that I could not add the two values of VARCHAR type, which differ only in trailing spaces, in the column of a composite primary key (SQL Server 2008). Perhaps for some people is obvious, but it seemed strange that, in principle, different values are considered as duplicates. For values of CHAR(n) type, which has a fixed length, all is clear, because short strings padded with spaces to length n. Therefore, input values, which differ only in trailing spaces, are indistinguishable. But trailing space in value of a VARCHAR type is like a normal character. Here's a simple experiment.
The second row will not be inserted into the table, while you receive a message on the primary key constraint violation. Ie second row is considered as a duplicate of the first. May be the fact is that when you insert, a trailing space was cut off. But no, insert another unique string with trailing space and check trailing space existence:
The value in the len2 column shows that a space exists in the data, but, nevertheless, the values of 'John' and 'John ' are duplicates:
It is obvious that the whole matter in the LEN() function, which, as we know, does ignore trailing spaces. I did not find this information in BOL, but apparently, it is this function is used when comparing the values of any string types. I was curious about the behavior of other databases in this case, and I repeated the experiment for MySQL and PostgreSQL. Following results were obtained.
MySQL (version 5.0)
PostgreSQL (version 8.3.6)
As can be seen, MySQL, and PostgreSQL take into account the space in the number of characters, and in the number of bytes used to store the string value. Unlike PostgreSQL, MySQL and SQL Server consider the rows, which differ only in trailing spaces, duplicates. Therefore, PostgreSQL allows you to insert a row in the tables:
Instead of conclusions
I am far from being able to say on the correctness of any implementation and, moreover, argue about which DBMS is better. I believe that you need to know closely the other DBMS you are using in their professional activities. Study the documentation and all subject to the tests rather than relying on intuition and common sense.
Suggested exercises: 131