How to translate Unicode character codes to TrueType glyph indices in Windows 95

Author: admin. Category: Screen word capture. Published: 2013-04-02 08:37.

http://support.microsoft.com/kb/241020

Article ID: 241020

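Summary

Applications usually draw Unicode text by specifying character codes in a string of wide characters. When a TrueType font is used, the operating system converts these character codes to TrueType glyph indices to determine which glyph to draw. An application may need to convert character codes to glyph indices itself. This article, with sample source code, discusses how to obtain glyph indices from a TrueType font file.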

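Drawing a character to a device in Windows involves mapping a character code to an index that selects the character's graphic (a glyph) in the font. When a TrueType font is used, these indices are called glyph indices.
TrueType font files have a flexible, table-oriented file format. This flexibility supports a variety of character encodings (or mappings) from character codes to their respective glyphs. One such encoding maps Unicode to glyph indices.
The TrueType font files used by Microsoft Windows operating systems are almost always Unicode encoded. Even Windows 95 (and its predecessors back to Windows 3.1) uses the Unicode standard as the internal encoding of its TrueType fonts. For details on the Unicode standard and TrueType font files, see the references section of this article.
The Win32 application programming interface (API) exports functions as two separate entry points. The first has an "A" appended to the base function name; the second has a "W" appended. These entry points support character-set (ANSI) string encoding and Unicode (wide-character) encoding, respectively. The Platform SDK header files define the base function name as either the "A" or the "W" variant, depending on whether the UNICODE macro is defined, but either variant can be referenced directly.
In Windows 95 and later, most functions in the Win32 API support only the ANSI version. In addition to the ANSI API, a limited subset of Unicode functions is supported, such as TextOutW, lstrlenW, and so on. For details on Unicode support in Windows 95 and its successors, see the references section.
This set of Unicode (wide-character) functions makes it possible to write applications with limited Unicode support. Just as an ANSI application may need glyph indices, a Unicode application may need access to them too. This can happen when an application uses advanced features of a TrueType font file, when it needs the glyph definitions, or when it must implement a workaround or a feature that the operating system does not provide.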
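An ANSI application can use the GetCharacterPlacementA function to convert a single-byte string of character codes to glyph indices. If the application uses Unicode string encoding and runs on Windows NT, it can use the wide-character version of GetCharacterPlacement. Windows 95, however, does not implement the GetCharacterPlacementW function, so it has no API that converts Unicode to glyph indices.
To convert Unicode character codes to TrueType glyph indices, the application must retrieve the data for the Unicode-to-glyph-index encoding table. Data in a TrueType font file can be obtained by calling the Win32 GetFontData function. This function returns a buffer of raw bytes, but it can extract that buffer relative to the start of a named TrueType table. It also operates on the TrueType font file that is currently realized in a device context (DC). These features make the function more useful than locating the font file and parsing its table directory to find the appropriate table.
The Unicode-to-glyph-index encoding is located in the TrueType font file table tagged "cmap", the tag name of the table that contains the font file's character maps. This table may contain one or more subtables, each holding a different mapping.
Following the initial unsigned short values in the "cmap" table is a directory of every encoding that the TrueType font file contains. Per the TrueType specification, the Unicode encoding is located in the subtable labeled with a PlatformId of 3 and a SpecificId of 1. The specification also defines the subtable referenced by this 3-1 encoding as a Format 4 subtable. The encoding with a PlatformId of 3 and a SpecificId of 0 (zero) is also a Format 4 encoding, but such a font file is usually interpreted as a symbol font file. Per the TrueType specification's recommendations for symbol fonts, that encoding is expected to contain character codes from the Unicode Private Use Area.
A Format 4 subtable is a sparse array. To accommodate the 64K entries required by the 16-bit Unicode standard, a Format 4 subtable is divided into segments that collect contiguous runs of characters. A segment is defined by the first and last character codes of the range it covers. The collection of segments is stored in the subtable as a set of parallel arrays: one (startCount) for the starting character of each range, and another (endCount) for the ending character. The segment arrays classify which character codes are potentially mapped; they are not the actual mapping.
Mapping to glyph indices requires a third parallel array from the subtable, named idRangeOffset. It determines which of two methods is used to compute the final glyph index. The first method uses a simple delta value to calculate the glyph index from the character code. The second method uses a lookup into an intermediate glyph ID table. If the value in this table is zero, there is no glyph; otherwise, the value is used to compute the final glyph index. The lookup table is an efficient way to represent a collection of discontinuous character codes that span a segment.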
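The balance of this article describes how to obtain TrueType glyph indices from a TrueType font file. It makes the following assumptions: a TrueType font is currently selected into the application's DC, the current string encoding is Unicode, and the TrueType font file has a Unicode character map (a "cmap" Format 4 subtable).
Note: This technique for obtaining glyph indices also works on Windows NT; however, it is usually unnecessary there because the Unicode version of the GetCharacterPlacement function is implemented. If an application's strings are not Unicode on Windows 95, it can call GetCharacterPlacementA to convert them to glyph indices instead.
The complete source code appears at the end of this article. The text of the article makes numerous references to the source code. See the sample source code to examine those references in their full context.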

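Odds and Ends

To use the data in a TrueType font file, several data type issues must be well understood. In the TrueType specification, all tables in the font file are defined as collections of base data types. The specification defines its own base data types, but they correspond well to some of the base data types defined by the Windows Platform SDK.
TrueType font files are byte packed, which means that every data type is located at a byte offset in the file. Where applicable, the specification includes explicit padding bytes in the table definitions. Most compilers allow, and sometimes default to, structure alignment on boundaries other than a byte. This means that a C structure defined to mimic a table definition may not be layout compatible with it.
Where possible, the sample code uses structure definitions to represent tables. To work correctly, the code must be compiled with byte structure alignment, which the sample code ensures with a pack pragma.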
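The effect of byte packing can be checked with a small portable sketch. The type names below (NaturalRecord, PackedRecord, PackedEncoding) and the helper packed_layout_ok are hypothetical and are not part of the article's sample; the sketch assumes a compiler that honors #pragma pack(push, 1), as MSVC, GCC and Clang do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default alignment: the compiler may insert padding after `tag`. */
typedef struct { uint8_t tag; uint32_t value; } NaturalRecord;

#pragma pack(push, 1)
/* Byte packed: fields sit at consecutive byte offsets, as in a font file. */
typedef struct { uint8_t tag; uint32_t value; } PackedRecord;
/* Same field sizes as the 8-byte "cmap" directory entry in this article. */
typedef struct {
    uint16_t platform_id;
    uint16_t encoding_id;
    uint32_t offset;
} PackedEncoding;
#pragma pack(pop)

/* Returns 1 when the packed layouts match the byte-packed sizes. */
static int packed_layout_ok(void)
{
    /* Packing can only remove padding, never add it. */
    assert(sizeof(NaturalRecord) >= sizeof(PackedRecord));
    return sizeof(PackedRecord) == 5 &&
           offsetof(PackedRecord, value) == 1 &&
           sizeof(PackedEncoding) == 8;
}
```

Using push and pop restores the previous alignment afterwards, so headers included later are not affected; the article's sample uses the simpler #pragma pack(1) form.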
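Even with byte packing guaranteed, simply reading a table into a structure is not correct. TrueType font files use "big-endian" (Motorola-style) byte ordering, whereas Intel microprocessors use "little-endian" byte ordering. This means that every value larger than one byte obtained from a TrueType font file must have its bytes swapped. Swapping the bytes makes the data compatible with Intel microprocessors. The SWAPWORD and SWAPLONG macro definitions provide a facility for doing this.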
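As an illustration of the byte-order issue, here is a minimal portable sketch that reads big-endian values directly from a byte buffer. The helper names read_be16, read_be32 and swap16 are assumptions made for this example; the article's own code uses the SWAPWORD and SWAPLONG macros instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 16-bit value; the result is the same on any host. */
static uint16_t read_be16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Read a big-endian 32-bit value. */
static uint32_t read_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Exchange the two bytes of a 16-bit value, as a 16-bit swap macro would. */
static uint16_t swap16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}
```

For the bytes 00 04 as they appear in a font file, read_be16 yields 4. A little-endian processor that loads the same two bytes as a native 16-bit value sees 0x0400, and swap16(0x0400) recovers 4.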

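Definitions

The "cmap" table in a TrueType font file contains several subtables, each of which defines a different encoding. To locate the individual subtables, the code sample defines a convenience macro named CMAPHEADERSIZE. The macro simplifies calculating the offset to the start of the subtable directory. It returns the size of the two unsigned short values that store the "cmap" table version and the number of subtables in the "cmap" table: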

/*  CMAP table Data
    From the TrueType Spec revision 1.66

    USHORT  Table Version #
    USHORT  Number of encoding tables
*/ 
#define     CMAPHEADERSIZE  (sizeof(USHORT)*2)

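Each encoding subtable has a directory entry in the main "cmap" table. In the source code this entry is represented by the _CMapEncoding structure definition. The structure contains the two ID fields that distinguish each subtable, and the offset from the start of the "cmap" table to where the subtable is located: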

typedef struct _CMapEncoding
{
    USHORT  PlatformId;
    USHORT  EncodingId;
    ULONG   Offset;
} CMAPENCODING;

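The Win32 GetFontData function takes a DWORD parameter as the table name. This is because the TrueType specification defines table names as four-byte tag sequences. To pack the table name correctly into the DWORD parameter of a GetFontData call, the code sample defines a macro named MAKETABLENAME. The macro works by shifting the individual byte values of the four table-name tags, in order, into a DWORD data type: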

// Macro to pack a TrueType table name into a DWORD.
#define     MAKETABLENAME(ch1, ch2, ch3, ch4) (\
    (((DWORD)(ch4)) << 24) | \
    (((DWORD)(ch3)) << 16) | \
    (((DWORD)(ch2)) << 8) | \
    ((DWORD)(ch1)) \
    )
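To make the packing order concrete, the following sketch packs the "cmap" tag the same way and unpacks it again. PACK_TAG and tag_to_string are hypothetical names used only for this illustration, and fixed-width types from stdint.h stand in for DWORD so that the example builds without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same order as the article's MAKETABLENAME: first character in the low byte. */
#define PACK_TAG(c1, c2, c3, c4) \
    (((uint32_t)(uint8_t)(c4) << 24) | ((uint32_t)(uint8_t)(c3) << 16) | \
     ((uint32_t)(uint8_t)(c2) << 8)  |  (uint32_t)(uint8_t)(c1))

/* Unpack a tag into a NUL-terminated four-character string. */
static void tag_to_string(uint32_t tag, char out[5])
{
    int i;
    assert(out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = (char)((tag >> (8 * i)) & 0xFFu);
    out[4] = '\0';
}
```

PACK_TAG('c', 'm', 'a', 'p') evaluates to 0x70616D63. On a little-endian processor that value is stored in memory as the bytes 'c', 'm', 'a', 'p', which is the order in which the tag appears in the font file.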

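The Unicode encoding subtable is labeled with a PlatformId of 3 and a SpecificId of 1. Per the TrueType specification, this "3-1" encoding is a Format 4 subtable. The source code defines a structure, _CMap4, that corresponds to the first seven data members of a Format 4 subtable plus a one-element array of unsigned shorts. The array represents the balance of the subtable definition, which consists of several arrays of unsigned shorts. Including the array symbol in the structure definition provides a convenient address for the start of the unsigned short arrays. The array symbol is then used to compute the starting addresses of the other arrays from their offsets relative to the first array. This is useful when a larger memory buffer containing the complete subtable is cast to the structure and dereferenced as arrays of unsigned shorts: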

typedef struct _CMap4   // From the TrueType Spec. revision 1.66.
{
    USHORT format;          // Format number is set to 4. 
    USHORT length;          // Length in bytes. 
    USHORT version;         // Version number (starts at 0).
    USHORT segCountX2;      // 2 x segCount
    USHORT searchRange;     // 2 x (2**floor(log2(segCount)))
    USHORT entrySelector;   // log2(searchRange/2)
    USHORT rangeShift;      // 2 x segCount - searchRange

    USHORT Arrays[1];       // Placeholder symbol for address of arrays following.
} CMAP4, *LPCMAP4;

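The Process

The code sample implements two basic tasks: searching the font file for the Unicode "cmap" subtable and retrieving it, and searching that subtable for the TrueType glyph index of a Unicode character code.
To retrieve table data from the TrueType font file given a DC, the code uses the GetFontData function. This function requires that the TrueType font file be selected into the DC. To retrieve data indexed by a specific TrueType table, the table's four-byte tag name must be packed into a DWORD. Because we only want data from the "cmap" table, the code sample defines a global DWORD, dwCmapName, that holds the packed "cmap" tag. Every call the code makes to GetFontData uses the global dwCmapName variable.
The GetTTUnicodeCoverage function is the source code that retrieves the Unicode "cmap" subtable. It is declared as: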

BOOL GetTTUnicodeCoverage ( 
    HDC hdc,            // DC with TT font.
    LPCMAP4 pBuffer,    // Properly allocated buffer.
    DWORD cbSize,       // Size of properly allocated buffer.
    DWORD *pcbNeeded    // Size of buffer needed.
    )

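This function retrieves the complete Unicode subtable from the TrueType "cmap" table. If it is called with a buffer that is too small (that is, the size stated by the cbSize parameter is too small or zero) or with a NULL pBuffer parameter, it fails and returns FALSE. When it fails in this way, it computes the required buffer size and returns it in the pcbNeeded parameter. If the function succeeds, it fills the pBuffer parameter and places the number of bytes copied in the pcbNeeded parameter.
The function first searches for a subtable that contains a Unicode encoding, either a "3-1" encoding or a "3-0" encoding. As defined in the "cmap" chapter of the TrueType specification, these are all Format 4 subtables.
If the function finds a Unicode encoding, it uses the GetFontFormat4Header function to retrieve the first seven elements of the Format 4 subtable. The code then computes the buffer size needed to return the entire Unicode subtable. If the buffer is too small or was not supplied, the size is returned to the caller so that it can allocate a buffer of the proper size and call the function again.
If the buffer supplied by the caller is large enough, the sample code then uses the GetFontFormat4Subtable function to retrieve the entire subtable. This function correctly reorders the bytes for Intel microprocessors. If the subtable retrieval succeeds, the result is copied into the caller's buffer. If it does not succeed, the code leaves the caller's buffer unmodified and can fail safely. By setting the bytes-needed parameter to zero, the code sample can indicate that it failed to copy any bytes into the buffer and distinguish this failure from an insufficient-buffer failure.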
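The calling convention described above (ask for the size, allocate, then call again) can be sketched without any Windows dependency. In this sketch get_blob is a hypothetical stand-in that behaves the way the article describes GetTTUnicodeCoverage, and fetch_blob is an assumed caller; neither is part of the article's sample.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical provider: fails and reports the size when the buffer is
   missing or too small, otherwise copies the data and returns 1. */
static int get_blob(void *buf, size_t cb, size_t *needed)
{
    static const unsigned char data[6] = {0, 4, 0, 6, 0, 0};
    assert(needed != NULL);
    *needed = sizeof data;
    if (buf == NULL || cb < sizeof data)
        return 0;
    memcpy(buf, data, sizeof data);
    return 1;
}

/* Caller side: the first call only asks for the size. */
static unsigned char *fetch_blob(size_t *size)
{
    size_t needed = 0;
    unsigned char *buf;

    assert(size != NULL);
    (void)get_blob(NULL, 0, &needed);   /* expected to fail, reports size */
    if (needed == 0)
        return NULL;                    /* zero means a real error */
    buf = (unsigned char *)malloc(needed);
    if (buf == NULL)
        return NULL;
    if (!get_blob(buf, needed, &needed)) {
        free(buf);
        return NULL;
    }
    *size = needed;
    return buf;                         /* caller frees */
}
```

Initializing needed to zero before the first call means a provider that fails without reporting a size is still treated as an error, rather than leading to an allocation of indeterminate size.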
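Once the Unicode subtable has been obtained, it can be used to retrieve the glyph index of a character code or to implement a number of other useful functions.
Converting a Unicode character code to a glyph index is done in the sample code's GetTTUnicodeGlyphIndex function: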

USHORT GetTTUnicodeGlyphIndex (
    HDC hdc,        // DC with a TrueType font selected.
    USHORT ch       // Unicode character to convert to Index.
    )

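This function has a simpler interface, requiring only a handle to the DC, hdc, which contains the TrueType font, and the Unicode character code to convert, ch. On success, the function returns the glyph index for ch. If the Unicode character code is not in the encoding (that is, the font has no glyph for it), the missing-glyph index of zero is returned.
It first retrieves the Unicode subtable by allocating a buffer and calling the GetTTUnicodeCoverage function. If an error occurs, the sample code fails the call by returning the missing-glyph index. In this case, the failure probably means that the DC does not contain a TrueType font, or that the TrueType font does not contain a suitable Unicode subtable.
Next, the code attempts to find the Unicode character code in the encoding. The search follows the Format 4 subtable description in the "cmap" chapter of the TrueType specification. The FindFormat4Segment function performs a linear search of the segments in the subtable. If no segment brackets the character code, the font file does not contain an encoding for it and therefore has no glyph. The code then returns the missing-glyph index.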
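The sample searches the segments linearly. Because the segments are sorted by end code, a binary search is a possible alternative; the sketch below shows one way to write it. This is a swapped-in technique, not the article's method: find_segment is a hypothetical helper, and it assumes the arrays have already been converted to host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the segment that contains ch, or -1 if none does.
   Assumes end_count[] is sorted in ascending order. */
static int find_segment(const uint16_t *start_count,
                        const uint16_t *end_count,
                        size_t seg_count, uint16_t ch)
{
    size_t lo = 0, hi = seg_count;

    assert(start_count != NULL && end_count != NULL);

    /* Find the first segment whose end code is >= ch. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (end_count[mid] < ch)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* No such segment, or ch falls in the gap before the segment starts. */
    if (lo == seg_count || start_count[lo] > ch)
        return -1;
    return (int)lo;
}
```

With segments 0x0020 to 0x007E, 0x00A0 to 0x00FF and the terminating 0xFFFF segment, the code 0x0041 is found in segment 0, while 0x0080 lies in a gap and is reported as not found.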
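The glyph index lookup occurs in the last part of the GetTTUnicodeGlyphIndex function. There are two methods of finding the glyph index for a particular glyph. Both cases use the idRangeOffset array, by examining the value at the ordinal index of the segment in which the character code was found.
In the first case, if the value at the segment's index in the idRangeOffset array is zero, the code dereferences the idDelta array at the same index, adds that value to the character code, and converts the sum to a glyph index using modulo arithmetic: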

// Per TT spec, if the RangeOffset is zero,
    if ( idRangeOffset[iSegment] == 0)
    {
        // calculate the glyph index directly.
        GlyphIndex = (idDelta[iSegment] + ch) % 65536;
    }
    else
    {
     ...
    }
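A small worked example may help with the modulo arithmetic. The DeltaSegment type, glyph_from_delta and the numbers used here are hypothetical: they assume a segment covering 'A' to 'Z' (0x0041 to 0x005A) whose first glyph index is 36, so the delta is 36 - 65 = -29, stored as the unsigned value 0xFFE3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One segment whose glyph indices are computed from a delta only. */
typedef struct {
    uint16_t start;     /* first character code in the segment */
    uint16_t end;       /* last character code in the segment */
    uint16_t id_delta;  /* delta, stored as an unsigned 16-bit value */
} DeltaSegment;

/* Glyph index = (character code + delta) modulo 65536. */
static uint16_t glyph_from_delta(const DeltaSegment *seg, uint16_t ch)
{
    assert(seg != NULL && ch >= seg->start && ch <= seg->end);
    return (uint16_t)((ch + seg->id_delta) % 65536u);
}
```

For 'A', 0x0041 + 0xFFE3 equals 0x10024, and the modulo reduces that to 0x0024, which is 36. The wraparound is what lets an unsigned field express a negative delta.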

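In the second case, the value at the segment's ordinal index is part of an index into a glyph index lookup table. This somewhat obscure indexing combines the value in the idRangeOffset element with the address of that element, relying on the order and position of the subtable's arrays, to return an intermediate ID value. The indexing mechanism is described in the Format 4 subtable chapter of the TrueType specification. If the value is nonzero, it is then added to the idDelta value and converted using modulo arithmetic; otherwise, there is no glyph and the missing-glyph index is returned: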

// Per TT spec, if the RangeOffset is zero,
    if ( idRangeOffset[iSegment] == 0)
    {
     ...
    }
    else
    {
        // otherwise, use the glyph ID array to get the index.
        USHORT idResult;    //Intermediate ID calc.

        idResult = *(
            idRangeOffset[iSegment]/2 + 
            (ch - startCount[iSegment]) + 
            &idRangeOffset[iSegment]
            );  // Indexing equation from TT spec.
        if (idResult)
            // Per TT spec, nonzero means there is a glyph.
            GlyphIndex = (idDelta[iSegment] + idResult) % 65536;
        else
            // Otherwise, return the missing glyph.
            GlyphIndex = 0;
    }
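The pointer expression above works because the glyph ID array follows the idRangeOffset array in memory. The sketch below expresses the same calculation with array indices and a bounds check. It is an illustration under assumptions, not the article's code: lookup_glyph is a hypothetical helper, tail stands for idRangeOffset immediately followed by the glyph ID array, all values are in host byte order, and ch is assumed to lie inside segment seg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* tail[] holds idRangeOffset[0..segCount-1] followed by the glyph ID array. */
static uint16_t lookup_glyph(const uint16_t *start_count,
                             const uint16_t *id_delta,
                             const uint16_t *tail, size_t tail_len,
                             size_t seg, uint16_t ch)
{
    uint16_t range_offset, id;
    size_t index;

    assert(start_count != NULL && id_delta != NULL && tail != NULL);
    assert(seg < tail_len);

    range_offset = tail[seg];
    if (range_offset == 0)                 /* first case: delta only */
        return (uint16_t)(ch + id_delta[seg]);

    /* Second case: same as *(rangeOffset/2 + (ch - start) + &idRangeOffset[seg]). */
    index = seg + range_offset / 2u + (size_t)(ch - start_count[seg]);
    if (index >= tail_len)
        return 0;                          /* malformed table: treat as missing */

    id = tail[index];
    return id ? (uint16_t)(id + id_delta[seg]) : 0;
}
```

In a table with two segments where tail is {4, 0, 10, 0, 12}, segment 0 has a range offset of 4 bytes, which is two elements past its own slot, so character codes 0x0041, 0x0042 and 0x0043 read the entries 10, 0 and 12. The middle one has no glyph.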

Other useful functions can be built by decoding the Unicode subtable. For example, the sample code implements a function called:

USHORT GetTTUnicodeCharCount ( 
    HDC hdc
    )

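This function adds up the characters covered by each segment of the Format 4 subtable to find the total number of Unicode character codes represented in the TrueType font file. Note, however, that the function must test each individual character code if a segment's mapping uses the glyph ID array rather than a contiguous range.
It is also instructive to note that the count of Unicode character codes mapped to a glyph is not necessarily equal to the number of glyphs contained in the font file. If the Unicode encoding maps several character codes to the same glyph, there may be fewer glyphs. There may also be more glyphs in the font file than the mapping suggests. For example, the TrueType Open (now known as OpenType Layout) tables define substitutions of a glyph index by multiple alternate glyphs.
The GetTTUnicodeGlyphIndex function can also be used to implement a function that determines whether a given TrueType font contains a glyph for a given Unicode character code. Simply call GetTTUnicodeGlyphIndex with the character code and test whether the return value equals the missing-glyph index (a value of 0).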

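Implementation Notes

This sample code was written for clarity of explanation. It is not well suited to repeated use because it allocates and retrieves the TrueType table on every call to the public functions. For use in a real application, a good optimization is to cache the Unicode encoding of the TrueType font file for as long as it remains selected in the DC. The application can check whether the font selected into the DC is the same TrueType font file as the cached one by comparing the font files' checksum values. This checksum is located in the table directory at the beginning of the TrueType font file and can be retrieved by using the GetFontData function. See the discussion of the font file checksum under "Table Directory" in the data types chapter of the TrueType specification.

Complete Source Code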

#include <windows.h>
#include <stdlib.h>     // malloc, free

#pragma pack(1)     // for byte alignment
// We need byte alignment to be structure compatible with the
// contents of a TrueType font file

// Macros to swap from Big Endian to Little Endian
#define SWAPWORD(x) MAKEWORD( \
    HIBYTE(x), \
    LOBYTE(x) \
    )
#define SWAPLONG(x) MAKELONG( \
    SWAPWORD(HIWORD(x)), \
    SWAPWORD(LOWORD(x)) \
    )

typedef struct _CMap4   // From the TrueType Spec. revision 1.66
{
    USHORT format;          // Format number is set to 4. 
    USHORT length;          // Length in bytes. 
    USHORT version;         // Version number (starts at 0).
    USHORT segCountX2;      // 2 x segCount.
    USHORT searchRange;     // 2 x (2**floor(log2(segCount)))
    USHORT entrySelector;   // log2(searchRange/2)
    USHORT rangeShift;      // 2 x segCount - searchRange

    USHORT Arrays[1];       // Placeholder symbol for address of arrays following
} CMAP4, *LPCMAP4;

/*  CMAP table Data
    From the TrueType Spec revision 1.66

    USHORT  Table Version #
    USHORT  Number of encoding tables
*/ 
#define     CMAPHEADERSIZE  (sizeof(USHORT)*2)

/*  ENCODING entry Data aka CMAPENCODING
    From the TrueType Spec revision 1.66

    USHORT  Platform Id
    USHORT  Platform Specific Encoding Id
    ULONG   Byte Offset from beginning of table
*/ 
#define     ENCODINGSIZE    (sizeof(USHORT)*2 + sizeof(ULONG))

typedef struct _CMapEncoding
{
    USHORT  PlatformId;
    USHORT  EncodingId;
    ULONG   Offset;
} CMAPENCODING;

// Macro to pack a TrueType table name into a DWORD
#define     MAKETABLENAME(ch1, ch2, ch3, ch4) (\
    (((DWORD)(ch4)) << 24) | \
    (((DWORD)(ch3)) << 16) | \
    (((DWORD)(ch2)) << 8) | \
    ((DWORD)(ch1)) \
    )

/* public functions */ 
USHORT GetTTUnicodeGlyphIndex(HDC hdc, USHORT ch);
USHORT GetTTUnicodeCharCount(HDC hdc);

// DWORD packed four letter table name for each GetFontData()
// function call when working with the CMAP TrueType table
DWORD dwCmapName = MAKETABLENAME( 'c','m','a','p' );

USHORT *GetEndCountArray(LPBYTE pBuff)
{
    return (USHORT *)(pBuff + 7 * sizeof(USHORT));  // Per TT spec
}

USHORT *GetStartCountArray(LPBYTE pBuff)
{
    DWORD   segCount = ((LPCMAP4)pBuff)->segCountX2/2;
    return (USHORT *)( pBuff + 
        8 * sizeof(USHORT) +        // 7 header + 1 reserved USHORT
        segCount*sizeof(USHORT) );  // Per TT spec
}

USHORT *GetIdDeltaArray(LPBYTE pBuff)
{
    DWORD   segCount = ((LPCMAP4)pBuff)->segCountX2/2;
    return (USHORT *)( pBuff + 
        8 * sizeof(USHORT) +        // 7 header + 1 reserved USHORT
        segCount * 2 * sizeof(USHORT) );    // Per TT spec
}

USHORT *GetIdRangeOffsetArray(LPBYTE pBuff)
{
    DWORD   segCount = ((LPCMAP4)pBuff)->segCountX2/2;
    return (USHORT *)( pBuff + 
        8 * sizeof(USHORT) +        // 7 header + 1 reserved USHORT
        segCount * 3 * sizeof(USHORT) );    // Per TT spec
}

void SwapArrays( LPCMAP4 pFormat4 )
{
    DWORD   segCount = pFormat4->segCountX2/2;  // Per TT Spec
    DWORD   i;
    USHORT  *pGlyphId, 
            *pEndOfBuffer, 
            *pstartCount    = GetStartCountArray( (LPBYTE)pFormat4 ), 
            *pidDelta       = GetIdDeltaArray( (LPBYTE)pFormat4 ), 
            *pidRangeOffset = GetIdRangeOffsetArray( (LPBYTE)pFormat4 ), 
            *pendCount      = GetEndCountArray( (LPBYTE)pFormat4 );

    // Swap the array elements for Intel.
    for (i=0; i < segCount; i++)
    {
        pendCount[i] = SWAPWORD(pendCount[i]);
        pstartCount[i] = SWAPWORD(pstartCount[i]);
        pidDelta[i] = SWAPWORD(pidDelta[i]);
        pidRangeOffset[i] = SWAPWORD(pidRangeOffset[i]);
    }

    // Swap the Glyph Id array
    pGlyphId = pidRangeOffset + segCount;   // Per TT spec
    pEndOfBuffer = (USHORT*)((LPBYTE)pFormat4 + pFormat4->length);
    for (;pGlyphId < pEndOfBuffer; pGlyphId++)
    {
        *pGlyphId = SWAPWORD(*pGlyphId);
    }
} /* end of function SwapArrays */ 

BOOL GetFontEncoding ( 
    HDC hdc, 
    CMAPENCODING * pEncoding, 
    int iEncoding 
    )
/*
    Note for this function to work correctly, structures must 
    have byte alignment.
*/ 
{
    DWORD   dwResult;
    BOOL    fSuccess = TRUE;

    // Get the structure data from the TrueType font
    dwResult = GetFontData ( 
        hdc, 
        dwCmapName, 
        CMAPHEADERSIZE + ENCODINGSIZE*iEncoding, 
        pEncoding, 
        sizeof(CMAPENCODING) );
    fSuccess = (dwResult == sizeof(CMAPENCODING));

    // swap the Platform Id for Intel
    pEncoding->PlatformId = SWAPWORD(pEncoding->PlatformId);

    // swap the Specific Id for Intel
    pEncoding->EncodingId = SWAPWORD(pEncoding->EncodingId);

    // swap the subtable offset for Intel
    pEncoding->Offset = SWAPLONG(pEncoding->Offset);

    return fSuccess;

} /* end of function GetFontEncoding */ 

BOOL GetFontFormat4Header ( 
    HDC hdc, 
    LPCMAP4 pFormat4, 
    DWORD dwOffset 
    )
/*
    Note for this function to work correctly, structures must 
    have byte alignment.
*/ 
{
    BOOL    fSuccess = TRUE;
    DWORD   dwResult;
    int     i;
    USHORT  *pField;

    // Loop and Alias a writeable pointer to the field of interest
    pField = (USHORT *)pFormat4;

    for (i=0; i < 7; i++)
    {
        // Get the field from the subtable
        dwResult = GetFontData ( 
            hdc, 
            dwCmapName, 
            dwOffset + sizeof(USHORT)*i, 
            pField, 
            sizeof(USHORT) );

        // swap it to make it right for Intel.
        *pField = SWAPWORD(*pField);
        // move on to the next
        pField++;
        // accumulate our success
        fSuccess = (dwResult == sizeof(USHORT)) && fSuccess;
    }

    return fSuccess;

} /* end of function GetFontFormat4Header */ 

BOOL GetFontFormat4Subtable ( 
    HDC hdc,                    // DC with TrueType font
    LPCMAP4 pFormat4Subtable,   // destination buffer
    DWORD   dwOffset            // Offset within font
    )
{
    DWORD   dwResult;
    USHORT  length;

    // Retrieve the header values in swapped order
    if (!GetFontFormat4Header ( hdc, 
        pFormat4Subtable, 
        dwOffset ))
    {
        return FALSE;
    }

    // Get the rest of the table
    length = pFormat4Subtable->length - (7 * sizeof(USHORT));
    dwResult = GetFontData( hdc, 
        dwCmapName,
        dwOffset + 7 * sizeof(USHORT),      // pos of arrays
        (LPBYTE)pFormat4Subtable->Arrays,   // destination
        length );       

    if ( dwResult != length)
    {
        // We really shouldn't ever get here
        return FALSE;
    }

    // Swap the arrays
    SwapArrays( pFormat4Subtable );

    return TRUE;
}

USHORT GetFontFormat4CharCount (
    LPCMAP4 pFormat4    // pointer to a valid Format4 subtable
    )
{
    USHORT  i,
            *pendCount,
            *pstartCount,
            *idRangeOffset;

    // Count the # of glyphs
    USHORT nGlyphs = 0;

    // Validate the pointer before the helper functions dereference it
    if ( pFormat4 == NULL )
        return 0;

    pendCount = GetEndCountArray((LPBYTE) pFormat4);
    pstartCount = GetStartCountArray((LPBYTE) pFormat4);
    idRangeOffset = GetIdRangeOffsetArray( (LPBYTE) pFormat4 );

    // by adding up the coverage of each segment
    for (i=0; i < (pFormat4->segCountX2/2); i++)
    {

        if ( idRangeOffset[i] == 0)
        {
            // if per the TT spec, the idRangeOffset element is zero,
            // all of the characters in this segment exist.
            nGlyphs += pendCount[i] - pstartCount[i] +1;
        }
        else
        {
            // otherwise we have to test for glyph existence for
            // each character in the segment.
            USHORT idResult;    //Intermediate id calc.
            DWORD ch;   // wider than USHORT so ch++ cannot wrap when pendCount[i] is 0xFFFF

            for (ch = pstartCount[i]; ch <= pendCount[i]; ch++)
            {
                // determine if a glyph exists
                idResult = *(
                    idRangeOffset[i]/2 + 
                    (ch - pstartCount[i]) + 
                    &idRangeOffset[i]
                    );  // indexing equation from TT spec
                if (idResult != 0)
                    // Yep, count it.
                    nGlyphs++;
            }
        }
    }

    return nGlyphs;
} /* end of function GetFontFormat4CharCount */ 

BOOL GetTTUnicodeCoverage ( 
    HDC hdc,            // DC with TT font
    LPCMAP4 pBuffer,    // Properly allocated buffer
    DWORD cbSize,       // Size of properly allocated buffer
    DWORD *pcbNeeded    // size of buffer needed
    )
/*
    if cbSize is too small or zero, or if pBuffer is NULL the function
    will fail and return the required buffer size in *pcbNeeded.

    if another error occurs, the function will fail and *pcbNeeded will
    be zero.

    When the function succeeds, *pcbNeeded contains the number of bytes 
    copied to pBuffer.
*/ 
{
    USHORT          nEncodings;     // # of encoding in the TT font
    CMAPENCODING    Encoding;       // The current encoding
    DWORD           dwResult;
    DWORD           i, 
                    iUnicode;       // The Unicode encoding
    CMAP4           Format4;        // Unicode subtable format
    LPCMAP4         pFormat4Subtable;   // Working buffer for subtable

    // Get the number of subtables in the CMAP table from the CMAP header
    // The # of subtables is the second USHORT in the CMAP table, per the TT Spec.
    dwResult = GetFontData ( hdc, dwCmapName, sizeof(USHORT), &nEncodings, sizeof(USHORT) );
    nEncodings = SWAPWORD(nEncodings);

    if ( dwResult != sizeof(USHORT) )
    {
        // Something is wrong, we probably got GDI_ERROR back
        // Probably this means that the Device Context does not have
        // a TrueType font selected into it.
        *pcbNeeded = 0;     // report "no size" so callers do not use garbage
        return FALSE;
    }

    // Get the encodings and look for a Unicode Encoding
    iUnicode = nEncodings;
    for (i=0; i < nEncodings; i++)
    {
        // Get the encoding entry for each encoding
        if (!GetFontEncoding ( hdc, &Encoding, i ))
        {
            *pcbNeeded = 0;
            return FALSE;
        }

        // Take note of the Unicode encoding.
        // 
        // A Unicode encoding per the TrueType specification has a
        // Platform Id of 3 and a Platform specific encoding id of 1
        // Note that Symbol fonts are supposed to have a Platform Id of 3 
        // and a specific id of 0. If the TrueType spec. suggestions were
        // followed then the Symbol font's Format 4 encoding could also
        // be considered Unicode because the mapping would be in the
        // Private Use Area of Unicode. We assume this here and allow 
        // Symbol fonts to be interpreted. If they do not contain a 
        // Format 4, we bail later. If they do not have a Unicode 
        // character mapping, we'll get wrong results.
        // Code could infer from the coverage whether 3-0 fonts are 
        // Unicode or not by examining the segments for placement within
        // the Private Use Area Subrange.
        if (Encoding.PlatformId == 3 && 
            (Encoding.EncodingId == 1 || Encoding.EncodingId == 0) )
        {
            iUnicode = i;       // Set the index to the Unicode encoding
        }
    }

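    // index out of range means failure to find a Unicode mapping
    if (iUnicode >= nEncodings)
    {
        // No Unicode encoding found.
        *pcbNeeded = 0;
        return FALSE;
    }

    // Re-read the directory entry recorded as Unicode. After the loop,
    // Encoding holds the last entry read, which may be a different one.
    if ( !GetFontEncoding ( hdc, &Encoding, iUnicode ) )
    {
        *pcbNeeded = 0;
        return FALSE;
    }

    // Get the header entries(first 7 USHORTs) for the Unicode encoding.
    if ( !GetFontFormat4Header ( hdc, &Format4, Encoding.Offset ) )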
    {
        *pcbNeeded = 0;
        return FALSE;
    }

    // Check to see if we retrieved a Format 4 table 
    if ( Format4.format != 4 )
    {
        // Bad, subtable is not format 4, bail.
        // This could happen if the font is corrupt
        // It could also happen if there is a new font format we
        // don't understand.
        *pcbNeeded = 0;
        return FALSE;
    }

    // Figure buffer size and tell caller if buffer too small
    *pcbNeeded = Format4.length;    
    if (*pcbNeeded > cbSize || pBuffer == NULL)
    {
        // Either test indicates caller needs to know
        // the buffer size and the parameters are not setup
        // to continue.
        return FALSE;
    }

    // allocate a full working buffer
    pFormat4Subtable = (LPCMAP4)malloc ( Format4.length );
    if ( pFormat4Subtable == NULL)
    {
        // Bad things happening if we can't allocate memory
        *pcbNeeded = 0;
        return FALSE;
    }

    // get the entire subtable
    if (!GetFontFormat4Subtable ( hdc, pFormat4Subtable, Encoding.Offset ))
    {
        // Retrieval failed; release the working buffer before bailing
        free ( pFormat4Subtable );
        *pcbNeeded = 0;
        return FALSE;
    }

    // Copy the retrieved table into the buffer
    CopyMemory( pBuffer, 
        pFormat4Subtable, 
        pFormat4Subtable->length );

    free ( pFormat4Subtable );
    return TRUE;
} /* end of function GetTTUnicodeCoverage */ 

BOOL FindFormat4Segment (
    LPCMAP4 pTable,     // a valid Format4 subtable buffer
    USHORT ch,          // Unicode character to search for
    USHORT *piSeg       // out: index of segment containing ch
    )
/*
    if the Unicode character ch is not contained in one of the 
    segments the function returns FALSE.

    if the Unicode character ch is found in a segment, the index
    of the segment is placed in*piSeg and the function returns
    TRUE.
*/ 
{
    USHORT  i, 
            segCount = pTable->segCountX2/2;
    USHORT  *pendCount = GetEndCountArray((LPBYTE) pTable);
    USHORT  *pstartCount = GetStartCountArray((LPBYTE) pTable);

    // Find segment that could contain the Unicode character code
    for (i=0; i < segCount && pendCount[i] < ch; i++);

    // We looked in them all, ch not there
    if (i >= segCount)
        return FALSE;

    // character code not within the range of the segment
    if (pstartCount[i] > ch)
        return FALSE;

    // this segment contains the character code
    *piSeg = i;
    return TRUE;
} /* end of function FindFormat4Segment */ 

USHORT GetTTUnicodeCharCount ( 
    HDC hdc
    )
/*
    Returns the number of Unicode character glyphs that 
    are in the TrueType font that is selected into the hdc.
*/ 
{
    LPCMAP4 pUnicodeCMapTable;
    USHORT  cChar;
    DWORD   dwSize;

    // Get the Unicode CMAP table from the TT font
    GetTTUnicodeCoverage( hdc, NULL, 0, &dwSize );
    pUnicodeCMapTable = (LPCMAP4)malloc( dwSize );
    if (!GetTTUnicodeCoverage( hdc, pUnicodeCMapTable, dwSize, &dwSize ))
    {
        // possibly no Unicode cmap, not a TT font selected,...
        free( pUnicodeCMapTable );
        return 0;
    }

    cChar = GetFontFormat4CharCount( pUnicodeCMapTable );
    free( pUnicodeCMapTable );

    return cChar;
} /* end of function GetTTUnicodeCharCount */ 

USHORT GetTTUnicodeGlyphIndex (
    HDC hdc,        // DC with a TrueType font selected
    USHORT ch       // Unicode character to convert to Index
    )
/*
    When the TrueType font contains a glyph for ch, the
    function returns the glyph index for that character.

    If an error occurs, or there is no glyph for ch, the
    function will return the missing glyph index of zero.
*/ 
{
    LPCMAP4 pUnicodeCMapTable;
    DWORD   dwSize;
    USHORT  iSegment;
    USHORT  *idRangeOffset;
    USHORT  *idDelta;
    USHORT  *startCount;
    USHORT  GlyphIndex = 0;     // Initialize to missing glyph

    // How big a buffer do we need for Unicode CMAP?
    GetTTUnicodeCoverage( hdc, NULL, 0, &dwSize );
    pUnicodeCMapTable = (LPCMAP4)malloc( dwSize );
    if (!GetTTUnicodeCoverage( hdc, pUnicodeCMapTable, dwSize, &dwSize ))
    {
        // Either no Unicode cmap, or some other error occurred
        // like font in DC is not TT.
        free( pUnicodeCMapTable );
        return 0;       // return missing glyph on error
    }

    // Find the cmap segment that has the character code.
    if (!FindFormat4Segment( pUnicodeCMapTable, ch, &iSegment ))
    {
        free( pUnicodeCMapTable );
        return 0;       // ch not in cmap, return missing glyph
    }

    // Get pointers to the cmap data
    idRangeOffset = GetIdRangeOffsetArray( (LPBYTE) pUnicodeCMapTable );
    idDelta = GetIdDeltaArray( (LPBYTE) pUnicodeCMapTable );
    startCount = GetStartCountArray( (LPBYTE) pUnicodeCMapTable );

    // Per TT spec, if the RangeOffset is zero,
    if ( idRangeOffset[iSegment] == 0)
    {
        // calculate the glyph index directly
        GlyphIndex = (idDelta[iSegment] + ch) % 65536;
    }
    else
    {
        // otherwise, use the glyph id array to get the index
        USHORT idResult;    //Intermediate id calc.

        idResult = *(
            idRangeOffset[iSegment]/2 + 
            (ch - startCount[iSegment]) + 
            &idRangeOffset[iSegment]
            );  // indexing equation from TT spec
        if (idResult)
            // Per TT spec, nonzero means there is a glyph
            GlyphIndex = (idDelta[iSegment] + idResult) % 65536;
        else
            // otherwise, return the missing glyph
            GlyphIndex = 0;
    }

    free( pUnicodeCMapTable );
    return GlyphIndex;
} /* end of function GetTTUnicodeGlyphIndex */

#pragma pack(1) // for byte alignment

// We need byte alignment to be structure compatible with the

// contents of a TrueType font file

// Macros to swap from Big Endian to Little Endian

#define SWAPWORD(x) MAKEWORD( \

HIBYTE(x), \

LOBYTE(x) \

)

#define SWAPLONG(x) MAKELONG( \

SWAPWORD(HIWORD(x)), \

SWAPWORD(LOWORD(x)) \

)

typedef struct _CMap4 // From the TrueType Spec. revision 1.66

{

USHORT format; // Format number is set to 4.

USHORT length; // Length in bytes.

USHORT version; // Version number (starts at 0).

USHORT segCountX2; // 2 x segCount.

USHORT searchRange; // 2 x (2**floor(log2(segCount)))

USHORT entrySelector; // log2(searchRange/2)

USHORT rangeShift; // 2 x segCount - searchRange

USHORT Arrays[1]; // Placeholder symbol for address of arrays following

} CMAP4, *LPCMAP4;

/* CMAP table Data

From the TrueType Spec revision 1.66

USHORT Table Version #

USHORT Number of encoding tables

#define CMAPHEADERSIZE (sizeof(USHORT)*2)

/* ENCODING entry Data aka CMAPENCODING

From the TrueType Spec revision 1.66

USHORT Platform Id

USHORT Platform Specific Encoding Id

ULONG Byte Offset from beginning of table

*/

#define ENCODINGSIZE (sizeof(USHORT)*2 + sizeof(ULONG))

typedef struct _CMapEncoding

{

USHORT PlatformId;

USHORT EncodingId;

ULONG Offset;

} CMAPENCODING;

// Macro to pack a TrueType table name into a DWORD

#define MAKETABLENAME(ch1, ch2, ch3, ch4) (\

(((DWORD)(ch4)) << 24) | \

(((DWORD)(ch3)) << 16) | \

(((DWORD)(ch2)) << 8) | \

((DWORD)(ch1)) \

)

/* public functions */

USHORT GetTTUnicodeGlyphIndex(HDC hdc, USHORT ch);

USHORT GetTTUnicodeCharCount(HDC hdc);

// DWORD packed four letter table name for each GetFontData()

// function call when working with the CMAP TrueType table

DWORD dwCmapName = MAKETABLENAME( 'c','m','a','p' );

USHORT *GetEndCountArray(LPBYTE pBuff)

{

return (USHORT *)(pBuff + 7 * sizeof(USHORT)); // Per TT spec

}

USHORT *GetStartCountArray(LPBYTE pBuff)

{

DWORD segCount = ((LPCMAP4)pBuff)->segCountX2/2;

return (USHORT *)( pBuff +

8 * sizeof(USHORT) + // 7 header + 1 reserved USHORT

segCount*sizeof(USHORT) ); // Per TT spec

}

USHORT *GetIdDeltaArray(LPBYTE pBuff)

{

DWORD segCount = ((LPCMAP4)pBuff)->segCountX2/2;

return (USHORT *)( pBuff +

8 * sizeof(USHORT) + // 7 header + 1 reserved USHORT

segCount * 2 * sizeof(USHORT) ); // Per TT spec

}

USHORT *GetIdRangeOffsetArray(LPBYTE pBuff)

{

DWORD segCount = ((LPCMAP4)pBuff)->segCountX2/2;

return (USHORT *)( pBuff +

8 * sizeof(USHORT) + // 7 header + 1 reserved USHORT

segCount * 3 * sizeof(USHORT) ); // Per TT spec

}

void SwapArrays( LPCMAP4 pFormat4 )

{

DWORD segCount = pFormat4->segCountX2/2; // Per TT Spec

DWORD i;

USHORT *pGlyphId,

*pEndOfBuffer,

*pstartCount = GetStartCountArray( (LPBYTE)pFormat4 ),

*pidDelta = GetIdDeltaArray( (LPBYTE)pFormat4 ),

*pidRangeOffset = GetIdRangeOffsetArray( (LPBYTE)pFormat4 ),

*pendCount = GetEndCountArray( (LPBYTE)pFormat4 );

// Swap the array elements for Intel.

for (i=0; i < segCount; i++)

{

pendCount[i] = SWAPWORD(pendCount[i]);

pstartCount[i] = SWAPWORD(pstartCount[i]);

pidDelta[i] = SWAPWORD(pidDelta[i]);

pidRangeOffset[i] = SWAPWORD(pidRangeOffset[i]);

}

// Swap the Glyph Id array

pGlyphId = pidRangeOffset + segCount; // Per TT spec

pEndOfBuffer = (USHORT*)((LPBYTE)pFormat4 + pFormat4->length);

for (;pGlyphId < pEndOfBuffer; pGlyphId++)

{

*pGlyphId = SWAPWORD(*pGlyphId);

}

} /* end of function SwapArrays */

BOOL GetFontEncoding (

HDC hdc,

CMAPENCODING * pEncoding,

int iEncoding

)

/* Note: for this function to work correctly, structures must

have byte alignment. */

{

DWORD dwResult;

BOOL fSuccess = TRUE;

// Get the structure data from the TrueType font

dwResult = GetFontData (

hdc,

dwCmapName,

CMAPHEADERSIZE + ENCODINGSIZE*iEncoding,

pEncoding,

sizeof(CMAPENCODING) );

fSuccess = (dwResult == sizeof(CMAPENCODING));

// swap the Platform Id for Intel

pEncoding->PlatformId = SWAPWORD(pEncoding->PlatformId);

// swap the Specific Id for Intel

pEncoding->EncodingId = SWAPWORD(pEncoding->EncodingId);

// swap the subtable offset for Intel

pEncoding->Offset = SWAPLONG(pEncoding->Offset);

return fSuccess;

} /* end of function GetFontEncoding */

BOOL GetFontFormat4Header (

HDC hdc,

LPCMAP4 pFormat4,

DWORD dwOffset

)

/* Note: for this function to work correctly, structures must

have byte alignment. */

{

BOOL fSuccess = TRUE;

DWORD dwResult;

int i;

USHORT *pField;

// Loop and Alias a writeable pointer to the field of interest

pField = (USHORT *)pFormat4;

for (i=0; i < 7; i++)

{

// Get the field from the subtable

dwResult = GetFontData (

hdc,

dwCmapName,

dwOffset + sizeof(USHORT)*i,

pField,

sizeof(USHORT) );

// swap it to make it right for Intel.

*pField = SWAPWORD(*pField);

// move on to the next

pField++;

// accumulate our success

fSuccess = (dwResult == sizeof(USHORT)) && fSuccess;

}

return fSuccess;

} /* end of function GetFontFormat4Header */

BOOL GetFontFormat4Subtable (

HDC hdc, // DC with TrueType font

LPCMAP4 pFormat4Subtable, // destination buffer

DWORD dwOffset // Offset within font

)

{

DWORD dwResult;

USHORT length;

// Retrieve the header values in swapped order

if (!GetFontFormat4Header ( hdc,

pFormat4Subtable,

dwOffset ))

{

return FALSE;

}

// Get the rest of the table

length = pFormat4Subtable->length - (7 * sizeof(USHORT));

dwResult = GetFontData( hdc,

dwCmapName,

dwOffset + 7 * sizeof(USHORT), // pos of arrays

(LPBYTE)pFormat4Subtable->Arrays, // destination

length );

if ( dwResult != length)

{

// We really shouldn't ever get here

return FALSE;

}

// Swap the arrays into little-endian order

SwapArrays( pFormat4Subtable );

return TRUE;

}

USHORT GetFontFormat4CharCount (

LPCMAP4 pFormat4 // pointer to a valid Format4 subtable

)

{

USHORT i,

*pendCount,

*pstartCount,

*idRangeOffset;

// Count the # of glyphs

USHORT nGlyphs = 0;

// Check the pointer before any helper dereferences it

if ( pFormat4 == NULL )

return 0;

pendCount = GetEndCountArray((LPBYTE) pFormat4);

pstartCount = GetStartCountArray((LPBYTE) pFormat4);

idRangeOffset = GetIdRangeOffsetArray( (LPBYTE) pFormat4 );

// by adding up the coverage of each segment

for (i=0; i < (pFormat4->segCountX2/2); i++)

{

if ( idRangeOffset[i] == 0)

{

// if per the TT spec, the idRangeOffset element is zero,

// all of the characters in this segment exist.

nGlyphs += pendCount[i] - pstartCount[i] +1;

}

else

{

// otherwise we have to test for glyph existence for

// each character in the segment.

USHORT idResult; //Intermediate id calc.

DWORD ch; // wider than USHORT so the loop ends when pendCount[i] is 0xFFFF

for (ch = pstartCount[i]; ch <= pendCount[i]; ch++)

{

// determine if a glyph exists

idResult = *(

idRangeOffset[i]/2 +

(ch - pstartCount[i]) +

&idRangeOffset[i]

); // indexing equation from TT spec

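if (idResult != 0)

// Yep, count it.

nGlyphs++;

} // end of loop over the characters in the segment

} // end of else

} // end of loop over the segments

return nGlyphs;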

} /* end of function GetFontFormat4CharCount */

BOOL GetTTUnicodeCoverage (

HDC hdc, // DC with TT font

LPCMAP4 pBuffer, // Properly allocated buffer

DWORD cbSize, // Size of properly allocated buffer

DWORD *pcbNeeded // size of buffer needed

)

/* If cbSize is too small or zero, or if pBuffer is NULL, the function

will fail and return the required buffer size in *pcbNeeded.

If another error occurs, the function will fail and *pcbNeeded will

be zero.

When the function succeeds, *pcbNeeded contains the number of bytes

copied to pBuffer. */

{

USHORT nEncodings; // # of encoding in the TT font

CMAPENCODING Encoding; // The current encoding

DWORD dwResult;

DWORD i,

iUnicode; // The Unicode encoding

CMAP4 Format4; // Unicode subtable format

LPCMAP4 pFormat4Subtable; // Working buffer for subtable

// Get the number of subtables in the CMAP table from the CMAP header

// The # of subtables is the second USHORT in the CMAP table, per the TT Spec.

dwResult = GetFontData ( hdc, dwCmapName, sizeof(USHORT), &nEncodings, sizeof(USHORT) );

nEncodings = SWAPWORD(nEncodings);

if ( dwResult != sizeof(USHORT) )

{

// Something is wrong, we probably got GDI_ERROR back

// Probably this means that the Device Context does not have

// a TrueType font selected into it.

*pcbNeeded = 0; // documented contract: zero on any other error

return FALSE;

}

// Get the encodings and look for a Unicode Encoding

iUnicode = nEncodings;

for (i=0; i < nEncodings; i++)

{

// Get the encoding entry for each encoding

if (!GetFontEncoding ( hdc, &Encoding, i ))

{

*pcbNeeded = 0;

return FALSE;

}

// Take note of the Unicode encoding.

// A Unicode encoding per the TrueType specification has a

// Platform Id of 3 and a Platform specific encoding id of 1

// Note that Symbol fonts are supposed to have a Platform Id of 3

// and a specific id of 0. If the TrueType spec. suggestions were

// followed then the Symbol font's Format 4 encoding could also

// be considered Unicode because the mapping would be in the

// Private Use Area of Unicode. We assume this here and allow

// Symbol fonts to be interpreted. If they do not contain a

// Format 4, we bail later. If they do not have a Unicode

// character mapping, we'll get wrong results.

// Code could infer from the coverage whether 3-0 fonts are

// Unicode or not by examining the segments for placement within

// the Private Use Area Subrange.

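if (Encoding.PlatformId == 3 &&

(Encoding.EncodingId == 1 || Encoding.EncodingId == 0) )

{

iUnicode = i; // Set the index to the Unicode encoding

}

} // end of loop over the encoding entries

// index out of range means failure to find a Unicode mapping

if (iUnicode >= nEncodings)

{

// No Unicode encoding found.

*pcbNeeded = 0;

return FALSE;

}

// Encoding still holds the last entry read by the loop, so re-read

// the entry that was recorded as the Unicode encoding.

if (!GetFontEncoding ( hdc, &Encoding, iUnicode ))

{

*pcbNeeded = 0;

return FALSE;

}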

// Get the header entries(first 7 USHORTs) for the Unicode encoding.

if ( !GetFontFormat4Header ( hdc, &Format4, Encoding.Offset ) )

{

*pcbNeeded = 0;

return FALSE;

}

// Check to see if we retrieved a Format 4 table

if ( Format4.format != 4 )

{

// Bad, subtable is not format 4, bail.

// This could happen if the font is corrupt

// It could also happen if there is a new font format we

// don't understand.

*pcbNeeded = 0;

return FALSE;

}

// Figure buffer size and tell caller if buffer is too small

*pcbNeeded = Format4.length;

if (*pcbNeeded > cbSize || pBuffer == NULL)

{

// Either test indicates caller needs to know

// the buffer size and the parameters are not setup

// to continue.

return FALSE;

}

// allocate a full working buffer

pFormat4Subtable = (LPCMAP4)malloc ( Format4.length );

if ( pFormat4Subtable == NULL)

{

// Bad things happening if we can't allocate memory

*pcbNeeded = 0;

return FALSE;

}

// get the entire subtable

if (!GetFontFormat4Subtable ( hdc, pFormat4Subtable, Encoding.Offset ))

{

// Reading the subtable failed; release the working buffer

free ( pFormat4Subtable );

*pcbNeeded = 0;

return FALSE;

}

// Copy the retrieved table into the buffer

CopyMemory( pBuffer,

pFormat4Subtable,

pFormat4Subtable->length );

free ( pFormat4Subtable );

return TRUE;

} /* end of function GetTTUnicodeCoverage */

BOOL FindFormat4Segment (

LPCMAP4 pTable, // a valid Format4 subtable buffer

USHORT ch, // Unicode character to search for

USHORT *piSeg // out: index of segment containing ch

)

/* If the Unicode character ch is not contained in one of the

segments, the function returns FALSE.

If the Unicode character ch is found in a segment, the index

of the segment is placed in *piSeg and the function returns

TRUE. */

{

USHORT i,

segCount = pTable->segCountX2/2;

USHORT *pendCount = GetEndCountArray((LPBYTE) pTable);

USHORT *pstartCount = GetStartCountArray((LPBYTE) pTable);

// Find segment that could contain the Unicode character code

for (i=0; i < segCount && pendCount[i] < ch; i++);

// We looked in them all, ch not there

if (i >= segCount)

return FALSE;

// character code not within the range of the segment

if (pstartCount[i] > ch)

return FALSE;

// this segment contains the character code

*piSeg = i;

return TRUE;

} /* end of function FindFormat4Segment */

USHORT GetTTUnicodeCharCount (

HDC hdc

)

/* Returns the number of Unicode character glyphs that

are in the TrueType font that is selected into the hdc. */

{

LPCMAP4 pUnicodeCMapTable;

USHORT cChar;

DWORD dwSize;

// Get the Unicode CMAP table from the TT font

GetTTUnicodeCoverage( hdc, NULL, 0, &dwSize );

pUnicodeCMapTable = (LPCMAP4)malloc( dwSize );

if (!GetTTUnicodeCoverage( hdc, pUnicodeCMapTable, dwSize, &dwSize ))

{

// possibly no Unicode cmap, not a TT font selected,...

free( pUnicodeCMapTable );

return 0;

}

cChar = GetFontFormat4CharCount( pUnicodeCMapTable );

free( pUnicodeCMapTable );

return cChar;

} /* end of function GetTTUnicodeCharCount */

USHORT GetTTUnicodeGlyphIndex (

HDC hdc, // DC with a TrueType font selected

USHORT ch // Unicode character to convert to Index

)

/* When the TrueType font contains a glyph for ch, the

function returns the glyph index for that character.

If an error occurs, or there is no glyph for ch, the

function will return the missing glyph index of zero. */

{

LPCMAP4 pUnicodeCMapTable;

DWORD dwSize;

USHORT iSegment;

USHORT *idRangeOffset;

USHORT *idDelta;

USHORT *startCount;

USHORT GlyphIndex = 0; // Initialize to missing glyph

// How big a buffer do we need for Unicode CMAP?

GetTTUnicodeCoverage( hdc, NULL, 0, &dwSize );

pUnicodeCMapTable = (LPCMAP4)malloc( dwSize );

if (!GetTTUnicodeCoverage( hdc, pUnicodeCMapTable, dwSize, &dwSize ))

{

// Either no Unicode cmap, or some other error occurred

// like font in DC is not TT.

free( pUnicodeCMapTable );

return 0; // return missing glyph on error

}

// Find the cmap segment that has the character code.

if (!FindFormat4Segment( pUnicodeCMapTable, ch, &iSegment ))

{

free( pUnicodeCMapTable );

return 0; // ch not in cmap, return missing glyph

}

// Get pointers to the cmap data

idRangeOffset = GetIdRangeOffsetArray( (LPBYTE) pUnicodeCMapTable );

idDelta = GetIdDeltaArray( (LPBYTE) pUnicodeCMapTable );

startCount = GetStartCountArray( (LPBYTE) pUnicodeCMapTable );

// Per TT spec, if the RangeOffset is zero,

if ( idRangeOffset[iSegment] == 0)

{

// calculate the glyph index directly

GlyphIndex = (idDelta[iSegment] + ch) % 65536;

}

else

{

// otherwise, use the glyph id array to get the index

USHORT idResult; //Intermediate id calc.

idResult = *(

idRangeOffset[iSegment]/2 +

(ch - startCount[iSegment]) +

&idRangeOffset[iSegment]

); // indexing equation from TT spec

if (idResult)

// Per TT spec, nonzero means there is a glyph

GlyphIndex = (idDelta[iSegment] + idResult) % 65536;

else

// otherwise, return the missing glyph

GlyphIndex = 0;

}

free( pUnicodeCMapTable );

return GlyphIndex;

} /* end of function GetTTUnicodeGlyphIndex */
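
The indexing equation used in GetTTUnicodeGlyphIndex can be checked without a device context. The following is only a minimal sketch, assuming the raw big-endian bytes of a format 4 subtable are already in memory. It reads each value on the fly instead of swapping the arrays in place. The names be16, format4_lookup and kSampleSubtable are hypothetical and not part of the listing above, and the sample table is made up for illustration rather than taken from a real font.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 16-bit value from raw font bytes. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: map ch to a glyph index using a raw, unswapped
   format 4 subtable of len bytes. Returns 0, the missing glyph, when
   ch is unmapped or the data does not look like a format 4 subtable. */
static uint16_t format4_lookup(const uint8_t *sub, size_t len, uint16_t ch)
{
    assert(sub != NULL);
    if (len < 16 || be16(sub) != 4)
        return 0;

    size_t segCount = be16(sub + 6) / 2u;
    if (16 + 8 * segCount > len)      /* 7 header words + pad + 4 arrays */
        return 0;

    const uint8_t *endCount      = sub + 14;
    const uint8_t *startCount    = endCount + 2 * segCount + 2; /* skip reserved word */
    const uint8_t *idDelta       = startCount + 2 * segCount;
    const uint8_t *idRangeOffset = idDelta + 2 * segCount;

    for (size_t i = 0; i < segCount; i++) {
        if (be16(endCount + 2 * i) < ch)
            continue;                 /* segment ends before ch */
        uint16_t start = be16(startCount + 2 * i);
        if (start > ch)
            return 0;                 /* ch lies in a gap between segments */

        uint16_t delta  = be16(idDelta + 2 * i);
        uint16_t offset = be16(idRangeOffset + 2 * i);
        if (offset == 0)
            return (uint16_t)(ch + delta);   /* arithmetic is modulo 65536 */

        /* offset is counted in bytes from the idRangeOffset[i] slot itself */
        size_t pos = (size_t)(idRangeOffset - sub) + 2 * i + offset
                   + 2 * (size_t)(ch - start);
        if (pos + 2 > len)
            return 0;
        uint16_t glyph = be16(sub + pos);
        return glyph ? (uint16_t)(glyph + delta) : 0;
    }
    return 0;
}

/* Synthetic 44-byte subtable with three segments:
   U+0041..U+0043 map to glyphs 1..3 through idDelta,
   U+0061..U+0062 go through the glyph id array,
   U+FFFF is the terminating segment. */
static const uint8_t kSampleSubtable[] = {
    0x00, 0x04,  0x00, 0x2C,  0x00, 0x00,  0x00, 0x06,  /* format, length, version, segCountX2 */
    0x00, 0x04,  0x00, 0x01,  0x00, 0x02,               /* searchRange, entrySelector, rangeShift */
    0x00, 0x43,  0x00, 0x62,  0xFF, 0xFF,               /* endCount[3] */
    0x00, 0x00,                                         /* reserved word */
    0x00, 0x41,  0x00, 0x61,  0xFF, 0xFF,               /* startCount[3] */
    0xFF, 0xC0,  0x00, 0x00,  0x00, 0x01,               /* idDelta[3] */
    0x00, 0x00,  0x00, 0x04,  0x00, 0x00,               /* idRangeOffset[3] */
    0x00, 0x07,  0x00, 0x00                             /* glyph id array */
};
```

With this sample table, looking up U+0041 gives glyph 1 and U+0061 gives glyph 7, while U+0062 (a zero entry in the glyph id array) and U+0044 (in a gap between segments) both give the missing glyph 0. Reading values on demand avoids modifying the buffer, at the cost of a byte swap on every access; the listing above swaps once and then indexes the arrays directly.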


References

For more information about the Unicode Standard, see:

The Unicode Consortium. The Unicode Standard, Version 2.0. Reading, Massachusetts: Addison-Wesley Developers Press, 1996. ISBN 0-201-48345-9.
On the Internet: The Unicode Consortium (http://www.unicode.org)

For additional information, see the following article in the Microsoft Knowledge Base:

210341 (http://support.microsoft.com/kb/210341/EN-US/) INFO: Unicode Support in Windows 95 and Windows 98

For more information about the TrueType specification, see:

Microsoft TrueType Specifications (http://www.microsoft.com/typography/tt/tt.htm)


This article comes from 学习日记 (Study Diary). If you repost it, please credit the source and include the link.

Permalink: https://www.softwareace.cn/?p=304
