SMS Counter

SMS Characters and the GSM Standard

About GSM-7

GSM, or Global System for Mobile Communications, defines how characters are counted in an SMS message.

It is important to understand this GSM specification in order to write your messages effectively and efficiently, and to keep a message from being split into multiple parts, since SMS carriers and providers charge by the SMS part.

A single SMS message that follows the standard can contain up to 160 characters. To comply with the GSM 03.38 specification, every character in the message must belong to the 7-bit default alphabet, which covers most printable ASCII characters as well as some accented characters such as u with diaeresis (ü) and e with grave (è).
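As a rough illustration of this membership rule, the Python sketch below checks whether every character of a message belongs to the GSM-7 default alphabet or its extension table. The names `GSM7_BASIC`, `GSM7_EXTENSION`, and `is_gsm7` are invented for this example, and the character lists are transcribed from the GSM 03.38 tables on this page.

```python
# Basic GSM 03.38 alphabet in code order (0x00-0x7F).
# "\x1b" is only a placeholder for the escape code at position 0x1B.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

# Characters reachable through the escape code (extension table).
GSM7_EXTENSION = "€\f[\\]^{|}~"


def is_gsm7(text: str) -> bool:
    """Return True if every character can be sent without switching to Unicode."""
    # The escape placeholder itself is not a sendable character, so drop it.
    allowed = (set(GSM7_BASIC) - {"\x1b"}) | set(GSM7_EXTENSION)
    return all(ch in allowed for ch in text)
```

For instance, `is_gsm7("Grüße, è!")` is true, while a single backtick or any Cyrillic letter makes the function return false, which in practice means the whole message would be sent as a Unicode SMS.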

If any character outside this set is used, the message is sent as a Unicode SMS. Because Unicode SMS uses a different character encoding, it is limited to 70 characters. If the message contains more than 160 characters (or 70 characters for a Unicode SMS), it is divided into multiple parts, and each part is charged.
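The part count can be estimated with a few lines of Python. This is a minimal sketch, not a billing-accurate calculator: the function name `sms_parts` is hypothetical, and the per-part sizes of 153 and 67 characters for multi-part messages are not stated in the text above. They are the commonly cited values when each part carries a 6-octet concatenation header, so confirm them with your provider.

```python
import math


def sms_parts(length: int, unicode_sms: bool = False) -> int:
    """Estimate how many SMS parts a message of `length` characters needs."""
    # Single-part limits come from the text: 160 (GSM-7) or 70 (Unicode).
    # Multi-part sizes (153 / 67) are an assumption based on a 6-octet header.
    single, multi = (70, 67) if unicode_sms else (160, 153)
    if length <= single:
        return 1
    return math.ceil(length / multi)
```

Under these assumptions a 161-character GSM-7 message already costs two parts, and a 71-character Unicode message does as well. Note that `length` should count extended GSM characters twice.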

GSM-7 Encoding details

GSM-7 is a widely used character encoding that packs each character into 7 bits, allowing efficient transmission of the letters and symbols commonly used in various languages on GSM networks. An SMS payload holds 140 8-bit octets (1,120 bits), which means that a GSM-7 encoded SMS message can contain up to 160 characters (1,120 / 7 = 160).
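To make the 7-bit packing concrete, here is a small Python sketch that packs a list of 7-bit codes into octets, filling each octet from its least significant bit upward. The helper name `pack_septets` is made up for this illustration, and the code ignores headers and padding rules that real messages may need.

```python
def pack_septets(septets: list[int]) -> bytes:
    """Pack 7-bit GSM codes into 8-bit octets, least significant bits first."""
    bits = 0
    for i, code in enumerate(septets):
        if not 0 <= code <= 0x7F:
            raise ValueError(f"not a 7-bit value: {code}")
        # Place each septet 7 bits further along the bit stream.
        bits |= code << (7 * i)
    # Round the total bit count up to whole octets.
    n_octets = (7 * len(septets) + 7) // 8
    return bits.to_bytes(n_octets, "little")
```

Packing the five codes for "hello" (0x68, 0x65, 0x6C, 0x6C, 0x6F) yields five octets instead of the 35 bits rounded any other way, and packing 160 codes yields exactly 140 octets, which is where the 160-character limit comes from.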

The GSM-7 character encoding is specified in the GSM 03.38 standard and is universally supported on GSM networks, where support for it is mandatory. Languages whose commonly used symbols do not fit into its 128 code points are supported by using shift tables or by switching to 16-bit UCS-2 encoding.
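The 70-character limit for Unicode messages follows from the same 140-octet payload: at 16 bits per character, 140 / 2 = 70. The sketch below counts 16-bit units in Python; the name `ucs2_units` is hypothetical, and UTF-16 is used as a stand-in for UCS-2, which is an assumption that matches UCS-2 for characters in the Basic Multilingual Plane, while characters outside it (many emoji) occupy two units.

```python
def ucs2_units(text: str) -> int:
    """Count the 16-bit units a message occupies when sent as a Unicode SMS."""
    # Big-endian UTF-16 without a byte order mark: two octets per unit.
    return len(text.encode("utf-16-be")) // 2
```

A six-letter Cyrillic word therefore occupies 6 of the 70 available units, and a single emoji outside the Basic Multilingual Plane occupies 2.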

GSM 03.38 7-bit character set

Hex
Decimal
Character name
Supported character
0x00
0
COMMERCIAL AT
@
0x01
1
POUND SIGN
£
0x02
2
DOLLAR SIGN
$
0x03
3
YEN SIGN
¥
0x04
4
LATIN SMALL LETTER E WITH GRAVE
è
0x05
5
LATIN SMALL LETTER E WITH ACUTE
é
0x06
6
LATIN SMALL LETTER U WITH GRAVE
ù
0x07
7
LATIN SMALL LETTER I WITH GRAVE
ì
0x08
8
LATIN SMALL LETTER O WITH GRAVE
ò
0x09
9
LATIN CAPITAL LETTER C WITH CEDILLA
Ç
0x0A
10
LINE FEED
<LF>
0x0B
11
LATIN CAPITAL LETTER O WITH STROKE
Ø
0x0C
12
LATIN SMALL LETTER O WITH STROKE
ø
0x0D
13
CARRIAGE RETURN
<CR>
0x0E
14
LATIN CAPITAL LETTER A WITH RING ABOVE
Å
0x0F
15
LATIN SMALL LETTER A WITH RING ABOVE
å
0x10
16
GREEK CAPITAL LETTER DELTA
Δ
0x11
17
LOW LINE
_
0x12
18
GREEK CAPITAL LETTER PHI
Φ
0x13
19
GREEK CAPITAL LETTER GAMMA
Γ
0x14
20
GREEK CAPITAL LETTER LAMBDA
Λ
0x15
21
GREEK CAPITAL LETTER OMEGA
Ω
0x16
22
GREEK CAPITAL LETTER PI
Π
0x17
23
GREEK CAPITAL LETTER PSI
Ψ
0x18
24
GREEK CAPITAL LETTER SIGMA
Σ
0x19
25
GREEK CAPITAL LETTER THETA
Θ
0x1A
26
GREEK CAPITAL LETTER XI
Ξ
0x1B
27
ESCAPE TO EXTENSION TABLE
<ESC>
0x1C
28
LATIN CAPITAL LETTER AE
Æ
0x1D
29
LATIN SMALL LETTER AE
æ
0x1E
30
LATIN SMALL LETTER SHARP S (German)
ß
0x1F
31
LATIN CAPITAL LETTER E WITH ACUTE
É
0x20
32
SPACE
<SP>
0x21
33
EXCLAMATION MARK
!
0x22
34
QUOTATION MARK
"
0x23
35
NUMBER SIGN
#
0x24
36
CURRENCY SIGN
¤
0x25
37
PERCENT SIGN
%
0x26
38
AMPERSAND
&
0x27
39
APOSTROPHE
'
0x28
40
LEFT PARENTHESIS
(
0x29
41
RIGHT PARENTHESIS
)
0x2A
42
ASTERISK
*
0x2B
43
PLUS SIGN
+
0x2C
44
COMMA
,
0x2D
45
HYPHEN-MINUS
-
0x2E
46
FULL STOP
.
0x2F
47
SOLIDUS (SLASH)
/
0x30
48
DIGIT ZERO
0
0x31
49
DIGIT ONE
1
0x32
50
DIGIT TWO
2
0x33
51
DIGIT THREE
3
0x34
52
DIGIT FOUR
4
0x35
53
DIGIT FIVE
5
0x36
54
DIGIT SIX
6
0x37
55
DIGIT SEVEN
7
0x38
56
DIGIT EIGHT
8
0x39
57
DIGIT NINE
9
0x3A
58
COLON
:
0x3B
59
SEMICOLON
;
0x3C
60
LESS-THAN SIGN
<
0x3D
61
EQUALS SIGN
=
0x3E
62
GREATER-THAN SIGN
>
0x3F
63
QUESTION MARK
?
0x40
64
INVERTED EXCLAMATION MARK
¡
0x41
65
LATIN CAPITAL LETTER A
A
0x42
66
LATIN CAPITAL LETTER B
B
0x43
67
LATIN CAPITAL LETTER C
C
0x44
68
LATIN CAPITAL LETTER D
D
0x45
69
LATIN CAPITAL LETTER E
E
0x46
70
LATIN CAPITAL LETTER F
F
0x47
71
LATIN CAPITAL LETTER G
G
0x48
72
LATIN CAPITAL LETTER H
H
0x49
73
LATIN CAPITAL LETTER I
I
0x4A
74
LATIN CAPITAL LETTER J
J
0x4B
75
LATIN CAPITAL LETTER K
K
0x4C
76
LATIN CAPITAL LETTER L
L
0x4D
77
LATIN CAPITAL LETTER M
M
0x4E
78
LATIN CAPITAL LETTER N
N
0x4F
79
LATIN CAPITAL LETTER O
O
0x50
80
LATIN CAPITAL LETTER P
P
0x51
81
LATIN CAPITAL LETTER Q
Q
0x52
82
LATIN CAPITAL LETTER R
R
0x53
83
LATIN CAPITAL LETTER S
S
0x54
84
LATIN CAPITAL LETTER T
T
0x55
85
LATIN CAPITAL LETTER U
U
0x56
86
LATIN CAPITAL LETTER V
V
0x57
87
LATIN CAPITAL LETTER W
W
0x58
88
LATIN CAPITAL LETTER X
X
0x59
89
LATIN CAPITAL LETTER Y
Y
0x5A
90
LATIN CAPITAL LETTER Z
Z
0x5B
91
LATIN CAPITAL LETTER A WITH DIAERESIS
Ä
0x5C
92
LATIN CAPITAL LETTER O WITH DIAERESIS
Ö
0x5D
93
LATIN CAPITAL LETTER N WITH TILDE
Ñ
0x5E
94
LATIN CAPITAL LETTER U WITH DIAERESIS
Ü
0x5F
95
SECTION SIGN
§
0x60
96
INVERTED QUESTION MARK
¿
0x61
97
LATIN SMALL LETTER A
a
0x62
98
LATIN SMALL LETTER B
b
0x63
99
LATIN SMALL LETTER C
c
0x64
100
LATIN SMALL LETTER D
d
0x65
101
LATIN SMALL LETTER E
e
0x66
102
LATIN SMALL LETTER F
f
0x67
103
LATIN SMALL LETTER G
g
0x68
104
LATIN SMALL LETTER H
h
0x69
105
LATIN SMALL LETTER I
i
0x6A
106
LATIN SMALL LETTER J
j
0x6B
107
LATIN SMALL LETTER K
k
0x6C
108
LATIN SMALL LETTER L
l
0x6D
109
LATIN SMALL LETTER M
m
0x6E
110
LATIN SMALL LETTER N
n
0x6F
111
LATIN SMALL LETTER O
o
0x70
112
LATIN SMALL LETTER P
p
0x71
113
LATIN SMALL LETTER Q
q
0x72
114
LATIN SMALL LETTER R
r
0x73
115
LATIN SMALL LETTER S
s
0x74
116
LATIN SMALL LETTER T
t
0x75
117
LATIN SMALL LETTER U
u
0x76
118
LATIN SMALL LETTER V
v
0x77
119
LATIN SMALL LETTER W
w
0x78
120
LATIN SMALL LETTER X
x
0x79
121
LATIN SMALL LETTER Y
y
0x7A
122
LATIN SMALL LETTER Z
z
0x7B
123
LATIN SMALL LETTER A WITH DIAERESIS
ä
0x7C
124
LATIN SMALL LETTER O WITH DIAERESIS
ö
0x7D
125
LATIN SMALL LETTER N WITH TILDE
ñ
0x7E
126
LATIN SMALL LETTER U WITH DIAERESIS
ü
0x7F
127
LATIN SMALL LETTER A WITH GRAVE
à
 

The Extended GSM character set

You can send some additional characters by using the <ESC> (0x1B) code from the table above followed by one more character. These additional characters, known as the Extended GSM character set, each count as two standard GSM characters because of the escape prefix.

Some of these characters might not display correctly because of handset limitations.

Hex
Decimal
Character name
Supported character
0x1B65
27 101
EURO SIGN
€
0x1B0A
27 10
FORM FEED
<FF>
0x1B3C
27 60
LEFT SQUARE BRACKET
[
0x1B2F
27 47
REVERSE SOLIDUS (BACKSLASH)
\
0x1B3E
27 62
RIGHT SQUARE BRACKET
]
0x1B14
27 20
CIRCUMFLEX ACCENT
^
0x1B28
27 40
LEFT CURLY BRACKET
{
0x1B40
27 64
VERTICAL BAR
|
0x1B29
27 41
RIGHT CURLY BRACKET
}
0x1B3D
27 61
TILDE
~
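Because each extended character costs two GSM characters, the length that counts toward the 160-character limit can differ from the visible length. A minimal sketch of that count follows; `GSM7_EXTENSION` and `gsm7_length` are names invented here, and the function assumes the text has already been checked to contain only GSM-7 characters.

```python
# Characters from the Extended GSM character set table above.
GSM7_EXTENSION = set("€\f[\\]^{|}~")


def gsm7_length(text: str) -> int:
    """Count GSM characters used, charging two for each extended character."""
    return sum(2 if ch in GSM7_EXTENSION else 1 for ch in text)
```

For example, `[ok]` is four visible characters but uses six GSM characters, so a message made up of 80 curly brackets already fills a single 160-character SMS.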

Common characters to avoid

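Some familiar characters are best avoided, either because they are missing from the tables above (which forces the whole message into Unicode) or because they belong to the extended set and count as two characters. Use their more efficient GSM counterparts instead.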

Character to avoid GSM equivalent
`
~
¬
| I (uppercase i)