SMS Counter
SMS Characters and the GSM Standard
About GSM-7
GSM, or Global System for Mobile Communications, defines how characters are counted in an SMS message.
Understanding this part of the GSM specification helps you write messages efficiently and avoid having them split into multiple parts, since carriers and providers charge per SMS part.
A single SMS message that follows the standard can contain up to 160 characters. To stay within this limit, every character in the message must belong to the 7-bit default alphabet defined in GSM 03.38, which covers most ASCII characters as well as some accented characters such as u with diaeresis (ü) and e with grave (è).
If the message uses any character outside this set, it is sent as a Unicode SMS. Because Unicode characters take 16 bits each, a Unicode SMS is limited to 70 characters. If the message exceeds 160 characters (or 70 characters for a Unicode SMS), it is split into multiple parts, and each part is charged separately.
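The part count described above can be sketched in a few lines of Python. The single-part limits (160 and 70) come from the text. The per-part limits for split messages (153 and 67) are an assumption not stated here: they reflect the common case where each part of a concatenated message gives up 6 octets to a header. The function name `sms_parts` is hypothetical, and for GSM-7 the length passed in should already count extended characters twice.

```python
import math

# Single-message limits, as stated in the text.
SINGLE_LIMIT = {"gsm7": 160, "ucs2": 70}
# Assumption (not from the source): each part of a split message
# reserves 6 octets for a header, leaving 153 / 67 characters.
MULTI_LIMIT = {"gsm7": 153, "ucs2": 67}

def sms_parts(length: int, encoding: str = "gsm7") -> int:
    """Return how many SMS parts a message of `length` characters needs."""
    if length <= SINGLE_LIMIT[encoding]:
        return 1
    return math.ceil(length / MULTI_LIMIT[encoding])

print(sms_parts(160))         # 1
print(sms_parts(161))         # 2
print(sms_parts(71, "ucs2"))  # 2
```

Under these assumptions, one character past the single-message limit doubles the cost of the message.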
GSM-7 Encoding details
GSM-7 is a widely used character encoding that packs commonly used letters and symbols from various languages into 7 bits each for efficient transmission on GSM networks. An SMS payload is 140 8-bit octets, or 1,120 bits, so a GSM-7 encoded message can hold up to 1,120 / 7 = 160 characters.
The GSM-7 character encoding is specified in the GSM 03.38 standard, and support for it is mandatory on GSM networks. Languages whose commonly used symbols do not fit in the 128-character default alphabet are supported through shift tables or by switching to 16-bit UCS-2 encoding.
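To make the 7-bit packing concrete, here is a minimal sketch of how septets can be packed into octets, least significant bits first. The helper name `pack_septets` is hypothetical. The example relies on the fact that unaccented Latin letters have the same code in GSM-7 as in ASCII, so `ord()` is enough for this input.

```python
def pack_septets(septets):
    """Pack 7-bit values into octets, least significant bits first."""
    acc = 0
    for i, septet in enumerate(septets):
        acc |= (septet & 0x7F) << (7 * i)   # place each septet 7 bits further up
    n_octets = (7 * len(septets) + 7) // 8  # round up to whole octets
    return acc.to_bytes(n_octets, "little")

# For unaccented Latin letters the GSM-7 code equals the ASCII code.
packed = pack_septets([ord(c) for c in "hello"])
print(packed.hex())                     # e8329bfd06
print(len(pack_septets([0x41] * 160)))  # 140
print(140 * 8 // 7)                     # 160
```

Five characters fit in five octets here only because 35 bits round up to 40; at full length, 160 septets fill the 140 octets exactly.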
GSM 03.38 7-bit character set
Hex | Decimal | Character name | Supported character |
0x00 | 0 | COMMERCIAL AT | @ |
0x01 | 1 | POUND SIGN | £ |
0x02 | 2 | DOLLAR SIGN | $ |
0x03 | 3 | YEN SIGN | ¥ |
0x04 | 4 | LATIN SMALL LETTER E WITH GRAVE | è |
0x05 | 5 | LATIN SMALL LETTER E WITH ACUTE | é |
0x06 | 6 | LATIN SMALL LETTER U WITH GRAVE | ù |
0x07 | 7 | LATIN SMALL LETTER I WITH GRAVE | ì |
0x08 | 8 | LATIN SMALL LETTER O WITH GRAVE | ò |
0x09 | 9 | LATIN CAPITAL LETTER C WITH CEDILLA | Ç |
0x0A | 10 | LINE FEED | <LF> |
0x0B | 11 | LATIN CAPITAL LETTER O WITH STROKE | Ø |
0x0C | 12 | LATIN SMALL LETTER O WITH STROKE | ø |
0x0D | 13 | CARRIAGE RETURN | <CR> |
0x0E | 14 | LATIN CAPITAL LETTER A WITH RING ABOVE | Å |
0x0F | 15 | LATIN SMALL LETTER A WITH RING ABOVE | å |
0x10 | 16 | GREEK CAPITAL LETTER DELTA | Δ |
0x11 | 17 | LOW LINE | _ |
0x12 | 18 | GREEK CAPITAL LETTER PHI | Φ |
0x13 | 19 | GREEK CAPITAL LETTER GAMMA | Γ |
0x14 | 20 | GREEK CAPITAL LETTER LAMBDA | Λ |
0x15 | 21 | GREEK CAPITAL LETTER OMEGA | Ω |
0x16 | 22 | GREEK CAPITAL LETTER PI | Π |
0x17 | 23 | GREEK CAPITAL LETTER PSI | Ψ |
0x18 | 24 | GREEK CAPITAL LETTER SIGMA | Σ |
0x19 | 25 | GREEK CAPITAL LETTER THETA | Θ |
0x1A | 26 | GREEK CAPITAL LETTER XI | Ξ |
0x1B | 27 | ESCAPE TO EXTENSION TABLE | <ESC> |
0x1C | 28 | LATIN CAPITAL LETTER AE | Æ |
0x1D | 29 | LATIN SMALL LETTER AE | æ |
0x1E | 30 | LATIN SMALL LETTER SHARP S (German) | ß |
0x1F | 31 | LATIN CAPITAL LETTER E WITH ACUTE | É |
0x20 | 32 | SPACE | <SP> |
0x21 | 33 | EXCLAMATION MARK | ! |
0x22 | 34 | QUOTATION MARK | " |
0x23 | 35 | NUMBER SIGN | # |
0x24 | 36 | CURRENCY SIGN | ¤ |
0x25 | 37 | PERCENT SIGN | % |
0x26 | 38 | AMPERSAND | & |
0x27 | 39 | APOSTROPHE | ' |
0x28 | 40 | LEFT PARENTHESIS | ( |
0x29 | 41 | RIGHT PARENTHESIS | ) |
0x2A | 42 | ASTERISK | * |
0x2B | 43 | PLUS SIGN | + |
0x2C | 44 | COMMA | , |
0x2D | 45 | HYPHEN-MINUS | - |
0x2E | 46 | FULL STOP | . |
0x2F | 47 | SOLIDUS (SLASH) | / |
0x30 | 48 | DIGIT ZERO | 0 |
0x31 | 49 | DIGIT ONE | 1 |
0x32 | 50 | DIGIT TWO | 2 |
0x33 | 51 | DIGIT THREE | 3 |
0x34 | 52 | DIGIT FOUR | 4 |
0x35 | 53 | DIGIT FIVE | 5 |
0x36 | 54 | DIGIT SIX | 6 |
0x37 | 55 | DIGIT SEVEN | 7 |
0x38 | 56 | DIGIT EIGHT | 8 |
0x39 | 57 | DIGIT NINE | 9 |
0x3A | 58 | COLON | : |
0x3B | 59 | SEMICOLON | ; |
0x3C | 60 | LESS-THAN SIGN | < |
0x3D | 61 | EQUALS SIGN | = |
0x3E | 62 | GREATER-THAN SIGN | > |
0x3F | 63 | QUESTION MARK | ? |
0x40 | 64 | INVERTED EXCLAMATION MARK | ¡ |
0x41 | 65 | LATIN CAPITAL LETTER A | A |
0x42 | 66 | LATIN CAPITAL LETTER B | B |
0x43 | 67 | LATIN CAPITAL LETTER C | C |
0x44 | 68 | LATIN CAPITAL LETTER D | D |
0x45 | 69 | LATIN CAPITAL LETTER E | E |
0x46 | 70 | LATIN CAPITAL LETTER F | F |
0x47 | 71 | LATIN CAPITAL LETTER G | G |
0x48 | 72 | LATIN CAPITAL LETTER H | H |
0x49 | 73 | LATIN CAPITAL LETTER I | I |
0x4A | 74 | LATIN CAPITAL LETTER J | J |
0x4B | 75 | LATIN CAPITAL LETTER K | K |
0x4C | 76 | LATIN CAPITAL LETTER L | L |
0x4D | 77 | LATIN CAPITAL LETTER M | M |
0x4E | 78 | LATIN CAPITAL LETTER N | N |
0x4F | 79 | LATIN CAPITAL LETTER O | O |
0x50 | 80 | LATIN CAPITAL LETTER P | P |
0x51 | 81 | LATIN CAPITAL LETTER Q | Q |
0x52 | 82 | LATIN CAPITAL LETTER R | R |
0x53 | 83 | LATIN CAPITAL LETTER S | S |
0x54 | 84 | LATIN CAPITAL LETTER T | T |
0x55 | 85 | LATIN CAPITAL LETTER U | U |
0x56 | 86 | LATIN CAPITAL LETTER V | V |
0x57 | 87 | LATIN CAPITAL LETTER W | W |
0x58 | 88 | LATIN CAPITAL LETTER X | X |
0x59 | 89 | LATIN CAPITAL LETTER Y | Y |
0x5A | 90 | LATIN CAPITAL LETTER Z | Z |
0x5B | 91 | LATIN CAPITAL LETTER A WITH DIAERESIS | Ä |
0x5C | 92 | LATIN CAPITAL LETTER O WITH DIAERESIS | Ö |
0x5D | 93 | LATIN CAPITAL LETTER N WITH TILDE | Ñ |
0x5E | 94 | LATIN CAPITAL LETTER U WITH DIAERESIS | Ü |
0x5F | 95 | SECTION SIGN | § |
0x60 | 96 | INVERTED QUESTION MARK | ¿ |
0x61 | 97 | LATIN SMALL LETTER A | a |
0x62 | 98 | LATIN SMALL LETTER B | b |
0x63 | 99 | LATIN SMALL LETTER C | c |
0x64 | 100 | LATIN SMALL LETTER D | d |
0x65 | 101 | LATIN SMALL LETTER E | e |
0x66 | 102 | LATIN SMALL LETTER F | f |
0x67 | 103 | LATIN SMALL LETTER G | g |
0x68 | 104 | LATIN SMALL LETTER H | h |
0x69 | 105 | LATIN SMALL LETTER I | i |
0x6A | 106 | LATIN SMALL LETTER J | j |
0x6B | 107 | LATIN SMALL LETTER K | k |
0x6C | 108 | LATIN SMALL LETTER L | l |
0x6D | 109 | LATIN SMALL LETTER M | m |
0x6E | 110 | LATIN SMALL LETTER N | n |
0x6F | 111 | LATIN SMALL LETTER O | o |
0x70 | 112 | LATIN SMALL LETTER P | p |
0x71 | 113 | LATIN SMALL LETTER Q | q |
0x72 | 114 | LATIN SMALL LETTER R | r |
0x73 | 115 | LATIN SMALL LETTER S | s |
0x74 | 116 | LATIN SMALL LETTER T | t |
0x75 | 117 | LATIN SMALL LETTER U | u |
0x76 | 118 | LATIN SMALL LETTER V | v |
0x77 | 119 | LATIN SMALL LETTER W | w |
0x78 | 120 | LATIN SMALL LETTER X | x |
0x79 | 121 | LATIN SMALL LETTER Y | y |
0x7A | 122 | LATIN SMALL LETTER Z | z |
0x7B | 123 | LATIN SMALL LETTER A WITH DIAERESIS | ä |
0x7C | 124 | LATIN SMALL LETTER O WITH DIAERESIS | ö |
0x7D | 125 | LATIN SMALL LETTER N WITH TILDE | ñ |
0x7E | 126 | LATIN SMALL LETTER U WITH DIAERESIS | ü |
0x7F | 127 | LATIN SMALL LETTER A WITH GRAVE | à |
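The 128 entries above can be held in a single string whose index is the GSM-7 code, which makes a lookup straightforward. This is a sketch only: the names `GSM7_BASIC`, `GSM7_CODE`, and `encode_basic` are hypothetical, and the extension table is deliberately left out here.

```python
# Index in this string == GSM-7 code (0x00..0x7F), following the table above.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿"
    "abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_CODE = {ch: code for code, ch in enumerate(GSM7_BASIC)}

def encode_basic(text):
    """Return the GSM-7 codes for `text`, or None if any character
    is missing from the basic table."""
    try:
        return [GSM7_CODE[ch] for ch in text]
    except KeyError:
        return None

print(encode_basic("Hi!"))  # [72, 105, 33]
print(encode_basic("@"))    # [0]
print(encode_basic("ç"))    # None (only the capital Ç is in the table)
```

Note that `@` maps to 0 rather than its ASCII value of 64, so treating GSM-7 as plain ASCII gives wrong results for several symbols.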
The Extended GSM character set
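Some of these characters might not display correctly on every handset because of handset limitations. Each one is sent as the escape code 0x1B (decimal 27) followed by a second code, so it counts as two characters.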
Hex | Decimal | Character name | Supported character |
0x1B65 | 27 101 | EURO SIGN | € |
0x1B0A | 27 10 | FORM FEED | <FF> |
0x1B3C | 27 60 | LEFT SQUARE BRACKET | [ |
0x1B2F | 27 47 | REVERSE SOLIDUS (BACKSLASH) | \ |
0x1B3E | 27 62 | RIGHT SQUARE BRACKET | ] |
0x1B14 | 27 20 | CIRCUMFLEX ACCENT | ^ |
0x1B28 | 27 40 | LEFT CURLY BRACKET | { |
0x1B40 | 27 64 | VERTICAL BAR | | |
0x1B29 | 27 41 | RIGHT CURLY BRACKET | } |
0x1B3D | 27 61 | TILDE | ~ |
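Because every extended character is an escape code plus a second code, it takes two 7-bit slots instead of one. The sketch below counts the slots a message needs under that rule. The names `GSM7_EXT` and `septet_length` are hypothetical, and the basic table is repeated so the block runs on its own.

```python
# Basic table: index in the string == GSM-7 code.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿"
    "abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extended table: character -> code sent after the 0x1B escape.
GSM7_EXT = {
    "\f": 0x0A, "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}

def septet_length(text):
    """Count the 7-bit slots `text` needs, or return None if it
    contains a character outside both tables (Unicode SMS needed)."""
    total = 0
    for ch in text:
        if ch in GSM7_EXT:
            total += 2   # escape code + extension code
        elif ch in GSM7_BASIC:
            total += 1
        else:
            return None
    return total

print(septet_length("Price: 5€"))  # 10
print(septet_length("[ok]"))       # 6
print(septet_length("你好"))        # None
```

A message that looks like 159 characters can therefore exceed the 160 limit if it contains two or more extended characters.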
Common characters to avoid
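Some familiar characters either fall outside GSM-7, which forces the whole message into Unicode, or belong to the extended set and count as two characters. Replace them with the more efficient GSM-7 counterparts below.
Character to avoid | GSM-7 replacement |
‘ | ' |
` | ' |
“ | " |
” | " |
~ | - |
¬ | - |
| | I (uppercase i) |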
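These substitutions can be applied automatically before sending. The following is a minimal sketch using Python's built-in `str.translate`; the name `to_gsm_friendly` is hypothetical, and the mapping covers only the characters listed above, so other non-GSM characters pass through unchanged.

```python
# Mapping taken from the list above; escapes are used for the curly quotes.
REPLACEMENTS = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "`": "'",        # backtick
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "~": "-",        # tilde (extended set, costs two characters)
    "\u00ac": "-",   # not sign
    "|": "I",        # vertical bar (extended set, costs two characters)
})

def to_gsm_friendly(text: str) -> str:
    """Swap the listed characters for their GSM-7 counterparts."""
    return text.translate(REPLACEMENTS)

print(to_gsm_friendly("\u201cHello\u201d ~ `world`"))  # "Hello" - 'world'
```

Replacing `~` and `|` changes the meaning of some text (for example in URLs or code), so a real application might want to leave those two out of the mapping.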