Strings
Wide and Wide-Wide Strings
We've seen many source-code examples so far that includes strings. In most of
them, we were using the standard string type: String
. This type is
useful for the common use-case of displaying messages or dealing with
information in plain English. Here, we define "plain English" as the use of the
language that avoids French accents or German umlaut, for example, and doesn't
make use of any characters in non-Latin alphabets.
There are two additional string types in Ada: Wide_String
, and
Wide_Wide_String
. These types are particularly important when dealing
with textual information in non-standard English, or in various other
languages, non-Latin alphabets and special symbols.
These string types use different bit widths for their characters. This becomes more apparent when looking at the type definitions:
type String is
array (Positive range <>) of Character;
type Wide_String is
array (Positive range <>) of Wide_Character;
type Wide_Wide_String is
array (Positive range <>) of
Wide_Wide_Character;
The following table shows the typical bit-width of each character of the string types:
Character Type |
Width |
---|---|
|
8 bits |
|
16 bits |
|
32 bits |
We can see that when running this example:
with Ada.Text_IO; use Ada.Text_IO; procedure Show_Wide_Char_Types is begin Put_Line ("Character'Size: " & Integer'Image (Character'Size)); Put_Line ("Wide_Character'Size: " & Integer'Image (Wide_Character'Size)); Put_Line ("Wide_Wide_Character'Size: " & Integer'Image (Wide_Wide_Character'Size)); end Show_Wide_Char_Types;
Let's look at another example, this time using wide strings:
with Ada.Text_IO; with Ada.Wide_Text_IO; with Ada.Wide_Wide_Text_IO; procedure Show_Wide_String_Types is package TI renames Ada.Text_IO; package WTI renames Ada.Wide_Text_IO; package WWTI renames Ada.Wide_Wide_Text_IO; S : constant String := "hello"; WS : constant Wide_String := "hello"; WWS : constant Wide_Wide_String := "hello"; begin TI.Put_Line ("String: " & S); TI.Put_Line ("Length: " & Integer'Image (S'Length)); TI.Put_Line ("Size: " & Integer'Image (S'Size)); TI.Put_Line ("Component_Size: " & Integer'Image (S'Component_Size)); TI.Put_Line ("------------------------"); WTI.Put_Line ("Wide string: " & WS); TI.Put_Line ("Length: " & Integer'Image (WS'Length)); TI.Put_Line ("Size: " & Integer'Image (WS'Size)); TI.Put_Line ("Component_Size: " & Integer'Image (WS'Component_Size)); TI.Put_Line ("------------------------"); WWTI.Put_Line ("Wide-wide string: " & WWS); TI.Put_Line ("Length: " & Integer'Image (WWS'Length)); TI.Put_Line ("Size: " & Integer'Image (WWS'Size)); TI.Put_Line ("Component_Size: " & Integer'Image (WWS'Component_Size)); TI.Put_Line ("------------------------"); end Show_Wide_String_Types;
Here, all strings (S
, WS
and WWS
) have the same length of
5 characters. However, the size of each character is different — thus,
each string has a different overall size.
The recommendation is to use the String
type when the textual
information you're processing is in standard English. In case any kind of
internationalization is needed, using Wide_Wide_String
is probably the
best choice, as it covers all possible use-cases.
In the Ada Reference Manual
Text I/O
Note that, in the previous example, we were using different versions of the
Ada.Text_IO
package depending on the string type we were using:
Ada.Text_IO
for objects ofString
type,Ada.Wide_Text_IO
for objects ofWide_String
type,Ada.Wide_Wide_Text_IO
for objects ofWide_Wide_String
type.
In that example, we were also using package renaming to differentiate among those packages.
Similarly, there are different versions of text I/O packages for individual
types. For example, if we want to display the value of a Long_Integer
variable based on the Wide_Wide_String
type, we can select the
Ada.Long_Integer_Wide_Wide_Text_IO
package. In fact, the list of
packages resulting from the combination of those types is quite long:
Scalar Type |
Text I/O Packages |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Also, there are different versions of the generic packages Integer_IO
and Float_IO
:
Scalar Type |
Text I/O Packages |
---|---|
Integer types |
|
Real types |
|
Wide and Wide-Wide String Handling
As we've just seen, we have different versions of the Ada.Text_IO
package. The same applies to string handling packages. As we've seen in the
Introduction to Ada course,
we can use the Ada.Strings.Fixed
and Ada.Strings.Maps
packages
for string handling. For other formats, we have these packages:
Ada.Strings.Wide_Fixed
,Ada.Strings.Wide_Wide_Fixed
,Ada.Strings.Wide_Maps
,Ada.Strings.Wide_Wide_Maps
.
Let's look at this example from the Introduction to Ada course, which we adapted for wide-wide strings:
with Ada.Strings; use Ada.Strings; with Ada.Strings.Wide_Wide_Fixed; use Ada.Strings.Wide_Wide_Fixed; with Ada.Strings.Wide_Wide_Maps; use Ada.Strings.Wide_Wide_Maps; with Ada.Wide_Wide_Text_IO; use Ada.Wide_Wide_Text_IO; procedure Show_Find_Words is S : constant Wide_Wide_String := "Hello" & 3 * " World"; F : Positive; L : Natural; I : Natural := 1; Whitespace : constant Wide_Wide_Character_Set := To_Set (' '); begin Put_Line ("String: " & S); Put_Line ("String length: " & Integer'Wide_Wide_Image (S'Length)); while I in S'Range loop Find_Token (Source => S, Set => Whitespace, From => I, Test => Outside, First => F, Last => L); exit when L = 0; Put_Line ("Found word instance at position " & F'Wide_Wide_Image & ": '" & S (F .. L) & "'"); I := L + 1; end loop; end Show_Find_Words;
In this example, we're using the Find_Token
procedure to find the words
from the phrase stored in the S
constant. All the operations we're using
here are similar to the ones for String
type, but making use of the
Wide_Wide_String
type instead. (We talk about the Wide_Wide_Image
attribute later on.)
In the Ada Reference Manual
Bounded and Unbounded Wide and Wide-Wide Strings
We've seen in the Introduction to Ada course
that other kinds of String
types are available. For example, we can
use bounded and
unbounded strings — those correspond
to the Bounded_String
and Unbounded_String
types.
Those kinds of string types are available for Wide_String
, and
Wide_Wide_String
. The following table shows the available types and
corresponding packages:
Type |
Package |
---|---|
|
|
|
|
|
|
|
|
The same applies to text I/O for those strings. For the standard case, we have
Ada.Text_IO.Bounded_IO
for the Bounded_String
type and
Ada.Text_IO.Unbounded_IO
for the Unbounded_String
type.
For wider string types, we have:
Type |
Text I/O Package |
---|---|
|
|
|
|
|
|
|
|
Let's look at a simple example:
with Ada.Strings.Wide_Wide_Unbounded; use Ada.Strings.Wide_Wide_Unbounded; with Ada.Wide_Wide_Text_IO.Wide_Wide_Unbounded_IO; use Ada.Wide_Wide_Text_IO.Wide_Wide_Unbounded_IO; procedure Show_Unbounded_Wide_Wide_String is S : Unbounded_Wide_Wide_String := To_Unbounded_Wide_Wide_String ("Hello"); begin S := S & Wide_Wide_String'(" hello"); Put_Line ("Unbounded wide-wide string: " & S); end Show_Unbounded_Wide_Wide_String;
In this example, we're declaring a variable S
and initializing it with
the word "Hello." Then, we're concatenating it with " hello" and displaying it.
All the operations we're using here are similar to the ones for
Unbounded_String
type, but they've been adapted for the
Unbounded_Wide_Wide_String
type.
In the Ada Reference Manual
String Encoding
Unicode is one of the most widespread standards for encoding writing systems other than the Latin alphabet. It defines a format called Unicode Transformation Format (UTF) in various versions, which vary according to the underlying precision, support for backwards-compatibility and other requirements.
In the Ada Reference Manual
UTF-8 encoding and decoding
A common UTF format is UTF-8, which encodes strings using up to four (8-bit) bytes and is backwards-compatible with the ASCII format. While encoding of ASCII characters requires only one byte, Chinese characters require three bytes, for example.
In Ada applications, UTF-8 strings are indicated by using the
UTF_8_String
from the Ada.Strings.UTF_Encoding
package.
In order to encode from and to UTF-8 strings, we can use the Encode
and Decode
functions. Those functions are specified in the child
packages of the Ada.Strings.UTF_Encoding package. We select the appropriate
child package depending on the string type we're using, as you can see in the
following table:
Child Package of
|
Convert from / to |
---|---|
|
|
|
|
|
|
Let's look at an example:
with Ada.Text_IO; use Ada.Text_IO; with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding; with Ada.Strings.UTF_Encoding.Wide_Wide_Strings; use Ada.Strings.UTF_Encoding.Wide_Wide_Strings; with Ada.Strings.Wide_Wide_Unbounded; use Ada.Strings.Wide_Wide_Unbounded; procedure Show_WW_UTF_String is function To_UWWS (Source : Wide_Wide_String) return Unbounded_Wide_Wide_String renames To_Unbounded_Wide_Wide_String; function To_WWS (Source : Unbounded_Wide_Wide_String) return Wide_Wide_String renames To_Wide_Wide_String; Hello_World_Arabic : constant UTF_8_String := "مرحبا يا عالم"; WWS_Hello_World_Arabic : constant Wide_Wide_String := Decode (Hello_World_Arabic); UWWS : Unbounded_Wide_Wide_String; begin UWWS := "Hello World: " & To_UWWS (WWS_Hello_World_Arabic); Show_WW_String : declare WWS : constant Wide_Wide_String := To_WWS (UWWS); begin Put_Line ("Wide_Wide_String Length: " & WWS'Length'Image); Put_Line ("Wide_Wide_String Size: " & WWS'Size'Image); end Show_WW_String; Put_Line ("---------------------------------------"); Put_Line ("Converting Wide_Wide_String to UTF-8..."); Show_UTF_8_String : declare S_UTF_8 : constant UTF_8_String := Encode (To_WWS (UWWS)); begin Put_Line ("UTF-8 String: " & S_UTF_8); Put_Line ("UTF-8 String Length: " & S_UTF_8'Length'Image); Put_Line ("UTF-8 String Size: " & S_UTF_8'Size'Image); end Show_UTF_8_String; end Show_WW_UTF_String;
In this application, we start by storing a string in Arabic in the
Hello_World_Arabic
constant. We then use the Decode
function to
convert that string from UTF_8_String
type to Wide_Wide_String
type — we store it in the WWS_Hello_World_Arabic
constant.
We use a variable of type Unbounded_Wide_Wide_String
(UWWS
) to
manipulate strings: we append the string in Arabic to the "Hello World: "
string and store it in UWWS
.
In the Show_WW_String
block, we convert the string — stored in
UWWS
— from the Unbounded_Wide_Wide_String
type to the
Wide_Wide_String
type and display the length and size of the string. We
do something similar in the Show_UTF_8_String
block, but there, we
convert to the UTF_8_String
type.
Also, in the Show_UTF_8_String
block, we use the Encode
function
to convert that string from Wide_Wide_String
type to then
UTF_8_String
type — we store it in the S_UTF_8
constant.
UTF-8 size and length
As you can see when running the last code example from the previous subsection, we have different sizes and lengths depending on the string type:
String type |
Size |
Length |
---|---|---|
|
832 |
26 |
|
296 |
37 |
The size needed for storing the string when using the Wide_Wide_String
type is bigger than the one when using the UTF_8_String
type. This is
expected, as the Wide_Wide_String
uses 32-bit characters, while the
UTF_8_String
type uses 8-bit codes to store the string in a more
efficient way (memory-wise).
The length of the string using the Wide_Wide_String
type is equivalent
to the number of symbols we have in the original string: 26 characters /
symbols. When using UTF-8, however, we may need more 8-bit codes to
represent one symbol from the original string, so we may end up with a length
value that is bigger than the actual number of symbols from the original string
— as it is the case in this source-code example.
This difference in sizes might not always be the case. In fact, the sizes match when encoding a symbol in UTF-8 that requires four 8-bit codes. For example:
with Ada.Text_IO; use Ada.Text_IO; with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding; with Ada.Strings.UTF_Encoding.Wide_Wide_Strings; use Ada.Strings.UTF_Encoding.Wide_Wide_Strings; procedure Show_UTF_8 is Symbol_UTF_8 : constant UTF_8_String := "𝚡"; Symbol_WWS : constant Wide_Wide_String := Decode (Symbol_UTF_8); begin Put_Line ("Wide_Wide_String Length: " & Symbol_WWS'Length'Image); Put_Line ("Wide_Wide_String Size: " & Symbol_WWS'Size'Image); Put_Line ("UTF-8 String Length: " & Symbol_UTF_8'Length'Image); Put_Line ("UTF-8 String Size: " & Symbol_UTF_8'Size'Image); New_Line; Put_Line ("UTF-8 String: " & Symbol_UTF_8); end Show_UTF_8;
In this case, both strings — using the Wide_Wide_String
type or
the UTF_8_String
type — have the same size: 32 bits. (Here, we're
using the 𝚡
symbol from the
Mathematical Alphanumeric Symbols block,
not the standard "x" from the
Basic Latin block.)
UTF-8 encoding in source-code files
In the past, it was common to use different character sets in text files when writing in different (human) languages. By default, Ada source-code files are expected to use the Latin-1 coding, which is a 8-bit character set.
Nowadays, however, using UTF-8 coding for text files — including source-code files — is very common. If your Ada code only uses standard ASCII characters, but you're saving it in a UTF-8 coded file, there's no need to worry about character sets, as UTF-8 is backwards compatible with ASCII.
However, you might want to use Unicode symbols in your Ada source code to declare constants — as we did in the previous sections — and store the source code in a UTF-8 coded file. In this case, you need be careful about how this file is parsed by the compiler.
Let's look at this source-code example:
with Ada.Text_IO; use Ada.Text_IO; with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding; procedure Show_UTF_8_Strings is Symbols_UTF_8 : constant UTF_8_String := "♥♫"; begin Put_Line ("UTF_8_String: " & Symbols_UTF_8); Put_Line ("Length: " & Symbols_UTF_8'Length'Image); end Show_UTF_8_Strings;
Here, we're using Unicode symbols to initialize the Symbols_UTF_8
constant of UTF_8_String
type.
Now, let's assume this source-code example is stored in a UTF-8 coded file.
Because the "♥♫"
string makes use of non-ASCII Unicode symbols,
representing this string in UTF-8 format will require more than 2 bytes.
In fact, each one of those Unicode symbols requires 2 bytes to be encoded in
UTF-8. (Keep in mind that Unicode symbols may require
between 1 to 4 bytes to be encoded in UTF-8 format.) Also,
in this case, the UTF-8 encoding process is using two additional bytes.
Therefore, the total length of the string is six, which matches what we see
when running the Show_UTF_8_Strings
procedure. In other words, the
length of the Symbols_UTF_8
string doesn't refer to those two characters
("♥♫"
) that we were using in the constant declaration, but the length of
the encoded bytes in its UTF-8 representation.
The UTF-8 format is very useful for storing and transmitting texts. However, if
we want to process Unicode symbols, it's probably better to use string types
with 32-bit characters — such as Wide_Wide_String
. For example,
let's say we want to use the "♥♫"
string again to initialize a constant
of Wide_Wide_String
type:
with Ada.Text_IO; with Ada.Wide_Wide_Text_IO; procedure Show_WWS_Strings is package TIO renames Ada.Text_IO; package WWTIO renames Ada.Wide_Wide_Text_IO; Symbols_WWS : constant Wide_Wide_String := "♥♫"; begin WWTIO.Put_Line ("Wide_Wide_String: " & Symbols_WWS); TIO.Put_Line ("Length: " & Symbols_WWS'Length'Image); end Show_WWS_Strings;---- run info:
In this case, as mentioned above, if we store this source code in a text file using UTF-8 format, we need to ensure that the UTF-8 coded symbols are correctly interpreted by the compiler when it parses the text file. Otherwise, we might get unexpected behavior. (Interpreting the characters in UTF-8 format as Latin-1 format is certainly an example of what we want to avoid here.)
In the GNAT toolchain
You can use UTF-8 coding in your source-code file and initialize strings of
32-bit characters. However, as we just mentioned, you need to make sure
that the UTF-8 coded symbols are correctly interpreted by the compiler when
dealing with types such as Wide_Wide_String
. For this case, GNAT
offers the -gnatW8
switch. Let's run the previous example using this
switch:
with Ada.Text_IO; with Ada.Wide_Wide_Text_IO; procedure Show_WWS_Strings is package TIO renames Ada.Text_IO; package WWTIO renames Ada.Wide_Wide_Text_IO; Symbols_WWS : constant Wide_Wide_String := "♥♫"; begin WWTIO.Put_Line ("Wide_Wide_String: " & Symbols_WWS); TIO.Put_Line ("Length: " & Symbols_WWS'Length'Image); end Show_WWS_Strings;
Because the Wide_Wide_String
type has 32-bit characters. we expect
the length of the string to match the number of symbols that we're using.
Indeed, when running the Show_WWS_Strings
procedure, we see that
the Symbols_WWS
string has a length of two characters, which matches
the number of characters of the "♥♫"
string.
When we use the -gnatW8
switch, GNAT converts the UTF-8-coded string
("♥♫"
) to UTF-32 format, so we get two 32-bit characters. It then
uses the UTF-32-coded string to initialize the Symbols_WWS
string.
If we don't use the -gnatW8
switch, however, we get wrong results.
Let's look at the same example again without the switch:
with Ada.Text_IO; with Ada.Wide_Wide_Text_IO; procedure Show_WWS_Strings is package TIO renames Ada.Text_IO; package WWTIO renames Ada.Wide_Wide_Text_IO; Symbols_WWS : constant Wide_Wide_String := "♥♫"; begin WWTIO.Put_Line ("Wide_Wide_String: " & Symbols_WWS); TIO.Put_Line ("Length: " & Symbols_WWS'Length'Image); end Show_WWS_Strings;
Now, the "♥♫"
string is being interpreted as a string of six 8-bit
characters. (In other words, the UTF-8-coded string isn't converted to
the UTF-32 format.) Each of those 8-bit characters is then stored in a
32-bit character of the Wide_Wide_String
type. This explains why
the Show_WWS_Strings
procedure reports a length of 6 components for
the Symbols_WWS
string.
Portability of UTF-8 in source-code files
In a previous code example, we were assuming that the format that we use for the source-code file is UTF-8. This allows us to simply use Unicode symbols directly in strings:
Symbol_UTF_8 : constant UTF_8_String := "★";
This approach, however, might not be portable. For example, if the compiler uses a different string encoding for source-code files, it might interpret that Unicode character as something else — or just throw a compilation error.
If you're afraid that format mismatches might happen in your compilation
environment, you may want to write strings in your code in a completely
portable fashion, which consists in entering the exact sequence of codes in
bytes — using the Character'Val
function — for the symbols
you want to use.
We can reuse parts of the previous example and replace the UTF-8 character with the corresponding UTF-8 code:
with Ada.Text_IO; use Ada.Text_IO; with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding; procedure Show_UTF_8 is Symbol_UTF_8 : constant UTF_8_String := Character'Val (16#e2#) & Character'Val (16#98#) & Character'Val (16#85#); begin Put_Line ("UTF-8 String: " & Symbol_UTF_8); end Show_UTF_8;
Here, we use a sequence of three calls to the Character'Val(code)
function for the UTF-8 code that corresponds to the "★" symbol.
UTF-16 encoding and decoding
So far, we've discussed the UTF-8 encoding scheme. However, other encoding
schemes exist and are supported as well. In fact, the
Ada.Strings.UTF_Encoding
package defines three encoding schemes:
type Encoding_Scheme is (UTF_8,
UTF_16BE,
UTF_16LE);
For example, instead of using UTF-8 encoding, we can use UTF-16 encoding
— either in the big-endian or in the little-endian version.
To convert between UTF-8 and UTF-16 encoding schemes, we can make use of the
conversion functions from the Ada.Strings.UTF_Encoding.Conversions
package.
To declare a UTF-16 encoded string, we can use one of the following data types:
the 8-bit-character based
UTF_String
type, orthe 16-bit-character based
UTF_16_Wide_String
type.
When using the 8-bit version, though, we have to specify the input and output schemes when converting between UTF-8 and UTF-16 encoding schemes.
Let's see a code example that makes use of both UTF_String
and
UTF_16_Wide_String
types:
with Ada.Text_IO; use Ada.Text_IO; with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding; with Ada.Strings.UTF_Encoding.Conversions; use Ada.Strings.UTF_Encoding.Conversions; procedure Show_UTF16_Types is Symbols_UTF_8 : constant UTF_8_String := "♥♫"; Symbols_UTF_16 : constant UTF_16_Wide_String := Convert (Symbols_UTF_8); -- ^ Calling Convert for UTF_8_String -- to UTF_16_Wide_String conversion. Symbols_UTF_16BE : constant UTF_String := Convert (Item => Symbols_UTF_8, Input_Scheme => UTF_8, Output_Scheme => UTF_16BE); -- ^ Calling Convert for UTF_8_String -- to UTF_String conversion in UTF-16BE -- encoding. begin Put_Line ("UTF_8_String: " & Symbols_UTF_8); Put_Line ("UTF_16_Wide_String: " & Convert (Symbols_UTF_16)); -- ^ Calling Convert for -- the UTF_16_Wide_String to -- UTF_8_String conversion. Put_Line ("UTF_String / UTF_16BE: " & Convert (Item => Symbols_UTF_16BE, Input_Scheme => UTF_16BE, Output_Scheme => UTF_8)); end Show_UTF16_Types;
In this example, we're declaring a UTF-8 encoded string and storing it in the
Symbols_UTF_8
constant. Then, we're calling the Convert
functions to convert between UTF-8 and UTF-16 encoding schemes. We're using two
versions of this function:
the
Convert
function that returns an object ofUTF_16_Wide_String
type for an input ofUTF_8_String
type, andthe
Convert
function that returns an object ofUTF_String
type for an input ofUTF_8_String
type.In this case, we need to specify the input and output schemes (see
Input_Scheme
andOutput_Scheme
parameters in the code example).
Previously, we've seen that the
Ada.Strings.UTF_Encoding.Wide_Wide_Strings
package offers functions to
convert between UTF-8 and the Wide_Wide_String
type. The same kind of
conversion functions exist for UTF-16 strings as well. Let's look at this code
example:
with Ada.Text_IO; use Ada.Text_IO; with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding; with Ada.Strings.UTF_Encoding.Wide_Wide_Strings; use Ada.Strings.UTF_Encoding.Wide_Wide_Strings; with Ada.Strings.UTF_Encoding.Conversions; use Ada.Strings.UTF_Encoding.Conversions; procedure Show_WW_UTF16_String is Symbols_UTF_16 : constant UTF_16_Wide_String := Wide_Character'Val (16#2665#) & Wide_Character'Val (16#266B#); -- ^ Calling Wide_Character'Val -- to specify the UTF-16 BE code -- for "♥" and "♫". Symbols_WWS : constant Wide_Wide_String := Decode (Symbols_UTF_16); -- ^ Calling Decode for UTF_16_Wide_String -- to Wide_Wide_String conversion. begin Put_Line ("UTF_16_Wide_String: " & Convert (Symbols_UTF_16)); -- ^ Calling Convert for the -- UTF_16_Wide_String to -- UTF_8_String conversion. Put_Line ("Wide_Wide_String: " & Encode (Symbols_WWS)); -- ^ Calling Encode for the -- Wide_Wide_String to -- UTF_8_String conversion. end Show_WW_UTF16_String;
In this example, we're calling the Wide_Character'Val
function to
specify the UTF-16 BE code of the "♥" and "♫" symbols. We're then using
the Decode
function to convert between the UTF_16_Wide_String
and
the Wide_Wide_String
types.
Image attribute
Overview
In the Introduction to Ada course, we've
seen that the Image
attribute returns a string that contains a textual
representation of an object. For example, we write Integer'Image (V)
to
get a string for the integer variable V
:
with Ada.Text_IO; use Ada.Text_IO; procedure Show_Simple_Image is V : Integer; begin V := 10; Put_Line ("V: " & Integer'Image (V)); end Show_Simple_Image;
Naturally, we can use the Image
attribute with other scalar types. For
example:
with Ada.Text_IO; use Ada.Text_IO; procedure Show_Simple_Image is type Status is (Unknown, Off, On); V : Float; S : Status; begin V := 10.0; S := Unknown; Put_Line ("V: " & Float'Image (V)); Put_Line ("S: " & Status'Image (S)); end Show_Simple_Image;
In this example, we retrieve a string representing the floating-point
variable V
. Also, we use Status'Image (V)
to retrieve a string representing the textual version of the Status
.
In the Ada Reference Manual
Type'Image
and Obj'Image
We can also apply the Image
attribute to an object directly:
with Ada.Text_IO; use Ada.Text_IO; procedure Show_Simple_Image is V : Integer; begin V := 10; Put_Line ("V: " & V'Image); -- Equivalent to: -- Put_Line ("V: " & Integer'Image (V)); end Show_Simple_Image;
In this example, the Integer'Image (V)
and V'Image
forms are
equivalent.
Wider versions of Image
Although we've been talking only about the Image
attribute, it's
important to mention that each of the wider versions of the string types also
has a corresponding Image
attribute. In fact, this is the attribute for
each string type:
Attribute |
Type of Returned String |
---|---|
|
|
|
|
|
|
Let's see a simple example:
with Ada.Wide_Wide_Text_IO; use Ada.Wide_Wide_Text_IO; procedure Show_Wide_Wide_Image is F : Float; begin F := 100.0; Put_Line ("F = " & F'Wide_Wide_Image); end Show_Wide_Wide_Image;
In this example, we use the Wide_Wide_Image
attribute to retrieve a
string of Wide_Wide_String
type for the floating-point variable
F
.
Image
attribute for non-scalar types
Note
This feature was introduced in Ada 2022.
In the previous code examples, we were using the Image
attribute with
scalar types, but it isn't restricted to those types. In fact, we can also use
this attribute when dealing with non-scalar types. For example:
package Simple_Records is type Rec is limited private; type Rec_Access is access Rec; function Init return Rec; type Null_Rec is null record; private type Rec is limited record F : Float; I : Integer; end record; function Init return Rec is ((F => 10.0, I => 4)); end Simple_Records;pragma Ada_2022; with Ada.Text_IO; use Ada.Text_IO; with Ada.Unchecked_Deallocation; with Simple_Records; use Simple_Records; procedure Show_Non_Scalar_Image is procedure Free is new Ada.Unchecked_Deallocation (Object => Rec, Name => Rec_Access); R_A : Rec_Access := new Rec'(Init); N_R : Null_Rec := (null record); begin R_A := new Rec'(Init); N_R := (null record); Put_Line ("R_A: " & R_A'Image); Put_Line ("R_A.all: " & R_A.all'Image); Put_Line ("N_R: " & N_R'Image); Free (R_A); Put_Line ("R_A: " & R_A'Image); end Show_Non_Scalar_Image;
In the Show_Non_Scalar_Image
procedure from this example, we display the
access value of R_A
and the contents of the dereferenced access object
(R_A.all
). Also, we see the indication that N_R
is a null record
and R_A
is null after the call to Free
.
Historically
Since Ada 2022, the Image
attribute is available for all types.
Prior to this version of the language, it was only available for scalar
types. (For other kind of types, programmers had to use the Image
attribute for each component of a record, for example.)
In fact, prior to Ada 2022, the Image
attribute was described in
the 3.5 Scalar Types section of the Ada Reference Manual, as
it was only applied to those types. Now, it is part of the new
Image Attributes section.
Let's see another example, this time with arrays:
pragma Ada_2022; with Ada.Text_IO; use Ada.Text_IO; procedure Show_Array_Image is type Float_Array is array (Positive range <>) of Float; FA_3C : Float_Array (1 .. 3); FA_Null : Float_Array (1 .. 0); begin FA_3C := [1.0, 3.0, 2.0]; FA_Null := []; Put_Line ("FA_3C: " & FA_3C'Image); Put_Line ("FA_Null: " & FA_Null'Image); end Show_Array_Image;
In this example, we display the values of the three components of the
FA_3C
array. Also, we display the null array FA_Null
.
Image
attribute for tagged types
In addition to untagged types, we can also use the Image
attribute with
tagged types. For example:
package Simple_Records is type Rec is tagged limited private; function Init return Rec; type Rec_Child is new Rec with private; overriding function Init return Rec_Child; private type Status is (Unknown, Off, On); type Rec is tagged limited record F : Float; I : Integer; end record; function Init return Rec is ((F => 10.0, I => 4)); type Rec_Child is new Rec with record Z : Status; end record; function Init return Rec_Child is (Rec'(Init) with Z => Off); end Simple_Records;pragma Ada_2022; with Ada.Text_IO; use Ada.Text_IO; with Simple_Records; use Simple_Records; procedure Show_Tagged_Image is R : constant Rec := Init; R_Class : constant Rec'Class := Rec'(Init); R_C : constant Rec_Child := Init; begin Put_Line ("R: " & R'Image); Put_Line ("R_Class: " & R_Class'Image); Put_Line ("R_A: " & R_C'Image); end Show_Tagged_Image;
In the Show_Tagged_Image
procedure from this example, we display the
contents of the R
object of Rec
type and the R_Class
object of Rec'Class
type. Also, we display the contents of the
R_C
object of the Rec_Child
type, which is derived from the
Rec
type.
Image
attribute for task and protected types
We can also apply the Image
attribute to protected objects and tasks:
package Simple_Tasking is protected type Protected_Float (I : Integer) is private V : Float := Float (I); end Protected_Float; protected type Protected_Null is private end Protected_Null; task type T is entry Start; end T; end Simple_Tasking;package body Simple_Tasking is protected body Protected_Float is end Protected_Float; protected body Protected_Null is end Protected_Null; task body T is begin accept Start; end T; end Simple_Tasking;pragma Ada_2022; with Ada.Text_IO; use Ada.Text_IO; with Simple_Tasking; use Simple_Tasking; procedure Show_Protected_Task_Image is PF : Protected_Float (0); PN : Protected_Null; T1 : T; begin Put_Line ("PF: " & PF'Image); Put_Line ("PN: " & PN'Image); Put_Line ("T1: " & T1'Image); T1.Start; end Show_Protected_Task_Image;
In this example, we display information about the protected object PF
,
the componentless protected object PN
and the task T1
.
Put_Image
aspect
Note
This feature was introduced in Ada 2022.
Overview
In the previous section, we discussed many details about the Image
attribute. In the code examples from that section, we've seen the default
behavior of this attribute: the string returned by the calls to Image
was always in the format defined by the Ada standard.
In some situations, however, we might want to customize the string that is
returned by the Image
attribute of a type T
. Ada allows us to do
that via the Put_Image
aspect. This is what we have to do:
Specify the
Put_Image
aspect for the typeT
and indicate a procedure with a specific parameter profile — let's say, for example, a procedure namedP
.Implement the procedure
P
and write the information we want to use into a buffer (by calling the routines defined forRoot_Buffer_Type
, such as thePut
procedure).
We can see these steps performed in the code example below:
pragma Ada_2022; with Ada.Strings.Text_Buffers; package Show_Put_Image is type T is null record with Put_Image => Put_Image_T; -- ^ Custom version of Put_Image use Ada.Strings.Text_Buffers; procedure Put_Image_T (Buffer : in out Root_Buffer_Type'Class; Arg : T); end Show_Put_Image;package body Show_Put_Image is procedure Put_Image_T (Buffer : in out Root_Buffer_Type'Class; Arg : T) is pragma Unreferenced (Arg); begin -- Call Put with customized -- information Buffer.Put ("<custom info>"); end Put_Image_T; end Show_Put_Image;
In the Show_Put_Image
package, we use the Put_Image
aspect in
the declaration of the T
type. There, we indicate that the
Image
attribute shall use the Put_Image_T
procedure instead
of the default version.
In the body of the Put_Image_T
procedure, we implement our custom
version of the Image
attribute. We do that by calling the
Put
procedure with the information we want to provide in the
Image
attribute. Here, we access a buffer of Root_Buffer_Type
type, which is defined in the Ada.Strings.Text_Buffers
package. (We
discuss more about this package
later on.)
In the Ada Reference Manual
Complete Example of Put_Image
Let's see a complete example in which we use the Put_Image
aspect and
write useful information to the buffer:
pragma Ada_2022; with Ada.Strings.Text_Buffers; package Custom_Numerics is type Float_Integer is record F : Float := 0.0; I : Integer := 0; end record with Dynamic_Predicate => Integer (Float_Integer.F) = Float_Integer.I, Put_Image => Put_Float_Integer; -- ^ Custom version of Put_Image use Ada.Strings.Text_Buffers; procedure Put_Float_Integer (Buffer : in out Root_Buffer_Type'Class; Arg : Float_Integer); end Custom_Numerics;package body Custom_Numerics is procedure Put_Float_Integer (Buffer : in out Root_Buffer_Type'Class; Arg : Float_Integer) is begin -- Call Wide_Wide_Put with customized -- information Buffer.Wide_Wide_Put ("(F : " & Arg.F'Wide_Wide_Image & ", " & "I : " & Arg.I'Wide_Wide_Image & ")"); end Put_Float_Integer; end Custom_Numerics;pragma Ada_2022; with Ada.Text_IO; use Ada.Text_IO; with Custom_Numerics; use Custom_Numerics; procedure Show_Put_Image is V : Float_Integer; begin V := (F => 100.2, I => 100); Put_Line ("V = " & V'Image); end Show_Put_Image;
In the Custom_Numerics
package of this example, we specify the
Put_Image
aspect and indicate the Put_Float_Integer
procedure.
In that procedure, we display the information of components F
and
I
. Then, in the Show_Put_Image
procedure, we use the Image
attribute for the V
variable and see the information in the exact format
we specified. (If you like to see the default version of the
Put_Image
instead, you may comment out the Put_Image
aspect part
in the declaration of Float_Integer
.)
Relation to the Image
attribute
Note that we cannot override the Image
attribute directly —
there's no Image
aspect that we could specify. However, as we've just
seen, we can do this indirectly by using our own version of the
Put_Image
procedure for a type T
.
The Image
attribute of a type T
makes use of the procedure
indicated in the Put_Image
aspect. Let's say we have the following
declaration:
type T is null record
with Put_Image => Put_Image_T;
When we then use the T'Image
attribute in our code, the custom
Put_Image_T
procedure is automatically called. This is a simplified
example of how the Image
function is implemented:
function Image (V : T)
return String is
Buffer : Custom_Buffer;
-- ^ of Root_Buffer_Type'Class
begin
-- Calling Put_Image procedure
-- for type T
Put_Image_T (Buffer, V);
-- Retrieving the text from the
-- buffer as a string
return Buffer.Get;
end Image;
In other words, the Image
attribute basically:
calls the
Put_Image
procedure specified in thePut_Image
aspect of typeT
's declaration and passes a buffer;
and
retrieves the contents of the buffer as a string and returns it.
If the Put_Image
aspect of type T
isn't specified, the default
version is used. (We've seen the default version of various types
in the previous section about the Image
attribute.)
Put_Image
and derived types
Types that were derived from untagged types (or null extensions) make use of
the Put_Image
procedure that was specified for
their parent type — either a custom procedure indicated in the
Put_Image
aspect or the default one. Naturally, if a derived type
has the Put_Image
aspect, the procedure indicated in the aspect is used
instead. For example:
pragma Ada_2022; with Ada.Strings.Text_Buffers; package Untagged_Put_Image is use Ada.Strings.Text_Buffers; type T is null record with Put_Image => Put_Image_T; procedure Put_Image_T (Buffer : in out Root_Buffer_Type'Class; Arg : T); type T_Derived_1 is new T; type T_Derived_2 is new T with Put_Image => Put_Image_T_Derived_2; procedure Put_Image_T_Derived_2 (Buffer : in out Root_Buffer_Type'Class; Arg : T_Derived_2); end Untagged_Put_Image;package body Untagged_Put_Image is procedure Put_Image_T (Buffer : in out Root_Buffer_Type'Class; Arg : T) is pragma Unreferenced (Arg); begin Buffer.Wide_Wide_Put ("Put_Image_T"); end Put_Image_T; procedure Put_Image_T_Derived_2 (Buffer : in out Root_Buffer_Type'Class; Arg : T_Derived_2) is pragma Unreferenced (Arg); begin Buffer.Wide_Wide_Put ("Put_Image_T_Derived_2"); end Put_Image_T_Derived_2; end Untagged_Put_Image;pragma Ada_2022; with Ada.Text_IO; use Ada.Text_IO; with Untagged_Put_Image; use Untagged_Put_Image; procedure Show_Untagged_Put_Image is Obj_T : T; Obj_T_Derived_1 : T_Derived_1; Obj_T_Derived_2 : T_Derived_2; begin Put_Line ("T'Image : " & Obj_T'Image); Put_Line ("T_Derived_1'Image : " & Obj_T_Derived_1'Image); Put_Line ("T_Derived_2'Image : " & Obj_T_Derived_2'Image); end Show_Untagged_Put_Image;
In this example, we declare the type T
and its derived types
T_Derived_1
and T_Derived_2
. When running this code, we see that:
T_Derived_1
makes use of thePut_Image_T
procedure from its parent.Note that, if we remove the
Put_Image
aspect from the declaration ofT
, the default version of thePut_Image
procedure is used for bothT
andT_Derived_1
types.
T_Derived_2
makes use of thePut_Image_T_Derived_2
procedure, which was indicated in thePut_Image
aspect of that type, instead of its parent's procedure.
Put_Image
and tagged types
Types that are derived from a tagged type may also inherit the Put_Image
aspect. However, there are a couple of small differences in comparison to
untagged types, as we can see in the following example:
pragma Ada_2022; with Ada.Strings.Text_Buffers; package Tagged_Put_Image is use Ada.Strings.Text_Buffers; type T is tagged record I : Integer := 0; end record with Put_Image => Put_Image_T; procedure Put_Image_T (Buffer : in out Root_Buffer_Type'Class; Arg : T); type T_Child_1 is new T with record I1 : Integer; end record; type T_Child_2 is new T with null record; type T_Child_3 is new T with record I3 : Integer := 0; end record with Put_Image => Put_Image_T_Child_3; procedure Put_Image_T_Child_3 (Buffer : in out Root_Buffer_Type'Class; Arg : T_Child_3); end Tagged_Put_Image;package body Tagged_Put_Image is procedure Put_Image_T (Buffer : in out Root_Buffer_Type'Class; Arg : T) is pragma Unreferenced (Arg); begin Buffer.Wide_Wide_Put ("Put_Image_T"); end Put_Image_T; procedure Put_Image_T_Child_3 (Buffer : in out Root_Buffer_Type'Class; Arg : T_Child_3) is pragma Unreferenced (Arg); begin Buffer.Wide_Wide_Put ("Put_Image_T_Child_3"); end Put_Image_T_Child_3; end Tagged_Put_Image;pragma Ada_2022; with Ada.Text_IO; use Ada.Text_IO; with Tagged_Put_Image; use Tagged_Put_Image; procedure Show_Tagged_Put_Image is Obj_T : T; Obj_T_Child_1 : T_Child_1; Obj_T_Child_2 : T_Child_2; Obj_T_Child_3 : T_Child_3; begin Put_Line ("T'Image : " & Obj_T'Image); Put_Line ("--------------------"); Put_Line ("T_Child_1'Image : " & Obj_T_Child_1'Image); Put_Line ("--------------------"); Put_Line ("T_Child_2'Image : " & Obj_T_Child_2'Image); Put_Line ("--------------------"); Put_Line ("T_Child_3'Image : " & Obj_T_Child_3'Image); Put_Line ("--------------------"); Put_Line ("T'Class'Image : " & T'Class (Obj_T_Child_1)'Image); end Show_Tagged_Put_Image;---- run info:
In this example, we declare the type T
and its derived types
T_Child_1
, T_Child_2
and T_Child_3
. When running this
code, we see that:
for both
T_Child_1
andT_Child_2
, the parent'sPut_Image
aspect (thePut_Image_T
procedure) is called and its information is combined with the information from the type extension;The information from the parent's
Put_Image_T
procedure is presented in an aggregate syntax — in this case, this results in(Put_Image_T)
.For the
T_Child_1
type, theI1
component of the type extension is displayed by calling a default version of thePut_Image
procedure for that component —(Put_Image_T with I1 => 0)
is displayed.For the
T_Child_2
type, no additional information is displayed because this type has a null extension.
for the
T_Child_3
type, thePut_Image_T_Child_3
procedure, which was indicated in thePut_Image
aspect of the type, is used.
Finally, class-wide types (such as T'Class
) include additional
information. Here, the tag of the specific derived type is displayed first
— in this case, the tag of the T_Child_1
type — and
then the actual information for the derived type is displayed.
Universal text buffer
In the previous section, we've seen that the
first parameter of the procedure indicated in the Put_Image
aspect has
the Root_Buffer_Type'Class
type, which is defined in the
Ada.Strings.Text_Buffers
package. In this section, we talk more about
this type and additional procedures associated with this type.
Note
This feature was introduced in Ada 2022.
Overview
We use the Root_Buffer_Type'Class
type to implement a universal text
buffer that is used to store and retrieve information about data types. Because
this text buffer isn't associated with specific data types, it is universal
— in the sense that we can really use it for any data type, regardless of
the characteristics of this type.
In theory, we could use Ada's universal text buffer to implement applications
that actually process text in some form — for example, when implementing
a text editor. However, in general, Ada programmers are only expected to make
use of the Root_Buffer_Type'Class
type when implementing a procedure for
the Put_Image
aspect. For this reason, we won't discuss any kind of
type derivation — or any other kind of usages of this type — in
this section. Instead, we'll just focus on additional subprograms from the
Ada.Strings.Text_Buffers
package.
In the Ada Reference Manual
Additional procedures
In the previous section, we used the Put
procedure — and the
related Wide_Put
and Wide_Wide_Put
procedures — from the
Ada.Strings.Text_Buffers
package. In addition to these procedures, the
package also includes:
the
New_Line
procedure, which writes a new line marker to the text buffer;the
Increase_Indent
procedure, which increases the indentation in the text buffer; andthe
Decrease_Indent
procedure, which decreases the indentation in the text buffer.
The Ada.Strings.Text_Buffers
package also includes the
Current_Indent
function, which retrieves the current indentation
counter.
Let's revisit an example from the previous section and use the procedures mentioned above:
pragma Ada_2022; with Ada.Strings.Text_Buffers; package Custom_Numerics is type Float_Integer is record F : Float; I : Integer; end record with Dynamic_Predicate => Integer (Float_Integer.F) = Float_Integer.I, Put_Image => Put_Float_Integer; -- ^ Custom version of Put_Image use Ada.Strings.Text_Buffers; procedure Put_Float_Integer (Buffer : in out Root_Buffer_Type'Class; Arg : Float_Integer); end Custom_Numerics;package body Custom_Numerics is procedure Put_Float_Integer (Buffer : in out Root_Buffer_Type'Class; Arg : Float_Integer) is begin Buffer.Wide_Wide_Put ("("); Buffer.New_Line; Buffer.Increase_Indent; Buffer.Wide_Wide_Put ("F : " & Arg.F'Wide_Wide_Image); Buffer.New_Line; Buffer.Wide_Wide_Put ("I : " & Arg.I'Wide_Wide_Image); Buffer.Decrease_Indent; Buffer.New_Line; Buffer.Wide_Wide_Put (")"); end Put_Float_Integer; end Custom_Numerics;pragma Ada_2022; with Ada.Text_IO; use Ada.Text_IO; with Custom_Numerics; use Custom_Numerics; procedure Show_Put_Image is V : Float_Integer; begin V := (F => 100.2, I => 100); Put_Line ("V = " & V'Image); end Show_Put_Image;
In the body of the Put_Float_Integer
procedure, we're using the
New_Line
, Increase_Indent
and Decrease_Indent
procedures
to improve the format of the string returned by the Float_Integer'Image
attribute. Using these procedures, you can create any kind of output format
for your custom type.