SGML Declaration

Document character set

In the SGML sense, the document character set for HTML 4.0 is the Universal Character Set (UCS) of [ISO10646]. At present it is code-for-code identical to the [UNICODE] standard.

Data transmission

When HTML text is transmitted directly in UCS-2 (charset="UNICODE-1-1"), the question of byte order arises: for a two-byte character, is the high-order byte sent first or last? This specification recommends that UCS-2 be transmitted in big-endian byte order (high-order byte first), which matches both the established network byte order and the [UNICODE] recommendation for serialized text data. Furthermore, to maximize the chance of correct interpretation, text transmitted as UCS-2 should begin with a ZERO-WIDTH NON-BREAKING SPACE character (hexadecimal FEFF). When its bytes are reversed this becomes FFFE, a value that is guaranteed never to be assigned to a character. A user agent that receives FFFE as the first octets of a text therefore knows that the bytes must be reversed for the remainder of the text.

The UTF-1 transformation format of [ISO10646] (registered by IANA as ISO-10646-UTF-1) should not be used.

SGML declaration

    <!SGML  "ISO 8879:1986"
    --
         SGML Declaration for HyperText Markup Language version 4.0

         With support for Unicode UCS-4 and increased limits
         for tag and literal lengths etc.
    --

         CHARSET
             BASESET  "ISO Registration Number 177//CHARSET
                       ISO/IEC 10646-1:1993 UCS-4 with
                       implementation level 3//ESC 2/5 2/15 4/6"
             DESCSET  0       9          UNUSED
                      9       2          9
                      11      2          UNUSED
                      13      1          13
                      14      18         UNUSED
                      32      95         32
                      127     1          UNUSED
                      128     32         UNUSED
                      160     2147483486 160
        --
        In ISO 10646, the positions with hexadecimal
        values 0000D800 - 0000DFFF, used in the UTF-16
        encoding of UCS-4, are reserved, as well as the last
        two code values in each plane of UCS-4, i.e. all
        values of the hexadecimal form xxxxFFFE or xxxxFFFF.
        These code values or the corresponding numeric
        character references must not be included when
        generating a new HTML document, and they should be
        ignored if encountered when processing a HTML
        document.
        --

    CAPACITY        SGMLREF
                    TOTALCAP        150000
                    GRPCAP          150000
                    ENTCAP          150000

    SCOPE    DOCUMENT
    SYNTAX
             SHUNCHAR CONTROLS 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
               17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 127
             BASESET  "ISO 646IRV:1991//CHARSET
                       International Reference Version
                       (IRV)//ESC 2/8 4/2"
             DESCSET  0 128 0

             FUNCTION
                      RE            13
                      RS            10
                      SPACE         32
                      TAB SEPCHAR    9

             NAMING   LCNMSTRT ""
                      UCNMSTRT ""
                      LCNMCHAR ".-"   -- include "~/_" for URLs? --
                      UCNMCHAR ".-"
                      NAMECASE GENERAL YES
                               ENTITY  NO
             DELIM    GENERAL  SGMLREF
                      SHORTREF SGMLREF
             NAMES    SGMLREF
             QUANTITY SGMLREF
                      ATTSPLEN 65536   -- These are the largest values --
                      LITLEN   65536   -- permitted in the declaration --
                      NAMELEN  65536   -- Avoid fixed limits in actual --
                      PILEN    65536   -- implementations of HTML UA's --
                      TAGLVL   100
                      TAGLEN   65536
                      GRPGTCNT 150
                      GRPCNT   64

    FEATURES
      MINIMIZE
        DATATAG  NO
        OMITTAG  YES
        RANK     NO
        SHORTTAG YES
      LINK
        SIMPLE   NO
        IMPLICIT NO
        EXPLICIT NO
      OTHER
        CONCUR   NO
        SUBDOC   NO
        FORMAL   YES
    >
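
As an illustration only, the following Python sketch shows one way a user agent might apply the byte-order recommendation for UCS-2 text and the rule on reserved code values stated in the declaration comment. The function names (`detect_ucs2_byte_order`, `decode_ucs2`, `is_reserved`) are hypothetical and do not come from this specification. The sketch assumes the input is a complete UCS-2 byte sequence, and that text with no leading FEFF or FFFE is big-endian, as recommended above.

```python
BOM = 0xFEFF           # ZERO-WIDTH NON-BREAKING SPACE
REVERSED_BOM = 0xFFFE  # what FEFF looks like with its two bytes swapped


def detect_ucs2_byte_order(data: bytes) -> str:
    """Return 'big' or 'little' from the first two octets of UCS-2 text.

    Assumption: without a leading FFFE the text is treated as big-endian.
    """
    if len(data) >= 2 and int.from_bytes(data[:2], "big") == REVERSED_BOM:
        return "little"
    return "big"


def is_reserved(code: int) -> bool:
    """True for code values that must not appear in an HTML document:
    0000D800-0000DFFF and every value of the form xxxxFFFE or xxxxFFFF."""
    return 0xD800 <= code <= 0xDFFF or (code & 0xFFFF) >= 0xFFFE


def decode_ucs2(data: bytes) -> str:
    """Decode UCS-2 bytes, honouring a leading FEFF/FFFE marker and
    ignoring reserved code values."""
    if len(data) % 2:
        raise ValueError("UCS-2 data must contain an even number of octets")
    order = detect_ucs2_byte_order(data)
    units = [int.from_bytes(data[i:i + 2], order)
             for i in range(0, len(data), 2)]
    if units and units[0] == BOM:
        units = units[1:]  # drop the byte-order marker itself
    return "".join(chr(u) for u in units if not is_reserved(u))
```

With this sketch, `decode_ucs2(b"\xfe\xff\x00H\x00i")` and `decode_ucs2(b"\xff\xfeH\x00i\x00")` both yield the same two-character string, because the second input is recognized as byte-reversed from its first two octets.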